Transcribing interviews as part of research used to be relatively simple. Social science has been doing this for decades – writing out what people say in order to analyse the content of conversations and interviews in fine detail. Traditionally, the written transcript is then refined for salient detail and highlights are pulled out for more concentrated attention. Where it gets complicated is when the transcript needs to reveal more than just speech or talk. Doing design research, I suggest, usually involves more than just talking. People may be handling objects, interacting online, using mobile technology, making things or doing other forms of creative work like drawing or mapping. This means there are multiple modes in operation and the transcript of the activity needs to be able to account for them – hence multimodal transcription.
In my research so far I have many types of activity, some captured on video and some in stills. Quite often people were working together, or at least talking while they were working on the task. This means the transcript needs to account for how people touched and handled the objects, how people handled and operated pencil, pen, and brush, how people negotiated the spaces and people around them, and how they explained what they did. Luckily, multimodal transcription offers a way to do this. Bezemer and Mavers (2011) explain some of the key concerns when constructing a multimodal transcript. The purpose of the transcription frames how it is received and what role it plays in the corpus of knowledge. In my case the transcript serves a PhD thesis in design research in an art and design institution. The transcript serves to professionalise my research data and to establish my credentials as a researcher. It is framed by the notion of practice-based design research, with all the associated attention to the artefact that I have written about previously. It seems clear that the artefact is an anchoring object in the transcript around which various modes circulate. This might be the special way design research designs multimodal transcription. In the end, the multimodal transcript is a form of representation and needs to be constructed as reflexively as any other design object.
Given that a video document may contain hours of footage, what is included in the transcript and what is left out is clearly very important. For example, in my research I am less interested in clothes, haircuts, and accents than I am in how people touch and draw and how they talk about what they have done. What principles are at play in what I select to include? How do those choices influence the validity of the analysis? Is the transcript aiming to be an ‘accurate’ account of what happened or a filtered interpretation? Once selections have been made for inclusion, they should be highlighted according to how salient they are to both the research question and the situation. I will aim to draw readers’ attention to the things I think are important in the transcript not just by what I include but by how I highlight certain moments or interactions. In text this is often done by using bold, italic or underlined type. Kress et al. (1988, 1996) suggest five ways of doing this with images in a multimodal transcript, among them spatial detail through perspective arrangement, pictorial detail (i.e. drawing vs photos), colour used to draw focus, and background that can be excluded or emphasised.
Another key idea in multimodal transcription is transduction. This means changing one mode into another, which is necessary to make events visible and analysable. Video is represented in text, stills, or drawings; speech is represented as text. There is always a transformational process underway which influences how meaning emerges from the transcript. As Bezemer says, video data are ‘transducted and edited representations through which analytical insights can be gained and certain details are lost’. The validity and accuracy of a transcript are judged not by how well it reproduces reality but by ‘how it facilitates a particular professional vision’. Finally, how the transcription is laid out can have a profound influence on how readers interpret it and where analytical attention is drawn. Usually modes are separated out so their interdependence and relation can be seen. Axes, timelines, tables and plates come into play here as ways of organising visual and textual data on the page. The resulting composition (or design) reflects the rhetorical or theoretical objectives of the research and places it in an analytical framework.
It all sounds complicated, but treating the process as consciously and creatively as the making of any design object promises to be fun : )