must communicate transgender visibility to international audiences while navigating domestic censorship and
cultural resistance.
Although research on transgender representation in South Asian cinema has grown in recent years, three critical
gaps remain. First, much of the existing work prioritizes narrative and discourse, with comparatively little
attention paid to the visual dimensions of representation such as cinematography, lighting, and spatial design.
Second, academic focus has been disproportionately centered on Bollywood, leaving Pakistani cinema
underexplored despite its emerging importance. Third, there has been limited application of semiotic and
performativity frameworks, tools essential for decoding how transgender identities are visually constructed on
screen. This study addresses these gaps by examining Joyland (2022) exclusively through its visual strategies,
building an academic bridge between film studies, gender studies, and South Asian cultural analysis.
METHODOLOGY
This study adopts a qualitative research design, focusing on visual-textual analysis of Joyland (2022). Visual
analysis is a widely recognized approach in film and cultural studies because it enables scholars to examine how
meaning is constructed not only through narrative but through visual codes such as cinematography, lighting,
costume, body language, and set design (Rose, 2016). Given that the research question is specifically concerned
with how transgender identities are visually represented, a qualitative framework is most appropriate.
The study does not attempt to quantify representation (e.g., screen time or frequency counts). Instead, it
prioritizes depth over breadth, offering interpretive insights into the film’s visual grammar (Kress & van
Leeuwen, 2006). This approach aligns with semiotic analysis traditions (Barthes, 1977) and critical cultural
methodologies that emphasize contextual meaning-making (Hall, 1997).
The unit of analysis in this study is the visual sequence (Kress & van Leeuwen, 2006), defined as a
continuous segment of the film in which visual strategies such as framing, lighting, costume, gesture, and spatial
design cohere to construct meaning. This may range from a single shot to a short montage. By concentrating on
visual sequences, the study enables a detailed exploration of how transgender identity is framed without being
constrained by the overall narrative structure. The analysis deliberately excludes dialogue, textual content, lyrics,
and musical elements, focusing solely on visual and compositional aspects of the cinematic image. Guided by two
interrelated theoretical perspectives, multimodal discourse analysis (Kress & van Leeuwen, 2006) and
multimodal film analysis (Bateman & Schmidt, 2012), this study examines how meaning is generated through
the interaction of visual modes. Kress and van Leeuwen’s framework provides tools to interpret how visual
resources such as composition, gaze, color, and spatial organization function as semiotic systems, while Bateman
and Schmidt’s model extends these principles into film analysis by emphasizing how shots, sequences, and
editing operate as multimodal ensembles. Together, these perspectives enable a nuanced interpretation of how
transgender identity is visually constructed through cinematic grammar and multimodal design.
The data for this study were gathered through an in-depth visual analysis of Joyland (2022), directed by Saim
Sadiq. The film was viewed multiple times to identify sequences that explicitly or implicitly engage with themes
of transgender identity and gender performance. A total of eight key visual sequences were purposively selected
based on their narrative and aesthetic significance, particularly those that foreground the transgender character,
Biba, in relation to camera movement, framing, costume, makeup, lighting, color palette, and spatial
composition. Each selected sequence was transcribed into a detailed shot-by-shot account, focusing exclusively
on visual and compositional features while deliberately excluding dialogue, lyrics, and musical elements. The
visual data were then coded manually using an interpretive framework informed by Kress and van Leeuwen’s
(2006) multimodal discourse theory and Bateman and Schmidt’s (2012) multimodal film analysis, which
together provide tools for understanding how meaning is produced through the orchestration of multiple visual
modes. Coding categories included framing, camera angle, body language, facial expression, costume, makeup,
lighting, color palette, and spatial design, allowing patterns of representation to emerge inductively across
sequences. This systematic multimodal coding process supported analytical depth and consistency in exploring how
Joyland constructs transgender identity through its cinematic and visual grammar.