John, Louise, Thomas, et al.,
I'm using Transcriber (http://www.etca.fr/CTA/gip/Projets/Transcriber/) to
transcribe interview recordings. Transcriber outputs the transcription in
XML. Speakers, turns, turn times, etc. are all tagged in XML. The XML data
is synchronized to the audio so one can move to any point in the interview
transcript and play the corresponding audio.
Here's my 2 cents. It would be nice if:
1. There was a standard DTD for qualitative interview and focus group data.
As I understand it, the original motivation for Transcriber was to assist
in the generation of transcripts of French Broadcast News and the layout of
the DTD reflects this, although, up to a point, it can readily be adapted
to the transcription of research interview recordings or any type of audio.
2. Programs such as Atlas-ti were capable of natively using XML coded
transcripts based on a standard DTD geared to qualitative research. At the
moment the Transcriber generated XML data has to be exported and formatted
as a TXT file prior to data analysis. In addition to the tags for speakers,
turns, etc., one would want an extensive list of tags defining the
properties of each interviewee (or focus group participant), interviewer,
the interview (date, setting, etc.), research project title, name of
corresponding audio file, etc. Some of these might be fairly generic (e.g.
gender), some might be project specific. Ideally, the data analysis program
would automatically associate these properties with all related data. So,
for example, if in the XML transcription data file "Interviewee A" had a
property sheet that included the gender property "female" then that
property would automatically be associated with every speech turn of that
interviewee. There would therefore be no need for elaborate schemes to code
these properties; it would just happen automatically when the primary XML
transcription data file (and accompanying audio data file - see #3) were
assigned as primary documents.
3. Programs such as Atlas-ti were capable of using the transcribed
interview in XML with the tags to the associated audio data intact so that
from within the data analysis program one would have instant access to both
the transcription and the synchronized audio (MP3, WAV, etc. audio file).
So, say you were coding a section of transcribed data and you thought there
might be an error in the transcription or maybe you wanted to hear the tone
of the speaker's voice, you'd just click a play button and the associated
audio would immediately play and the text display would scroll forward in
step with the audio.
4. There was a tool similar to Transcriber, maybe a modified version of
Transcriber, tailored to qualitative data analysis. A DTD is no use if
there isn't an easy way to use it to generate XML transcripts. We need
tools for transcription that are better integrated with qualitative data
analysis.
5. There were better tools to aid confidentiality procedures. Any efforts
towards improving qualitative interview transcription should also take into
account that in many projects one will want to assign a set of pseudonyms,
harmonized across multiple transcripts, to protect confidentiality. It
would be nice to have a transcription tool in which a database of mappings
was maintained so that, for example, in interview transcripts from a
project related to medical settings every occurrence of
"Royal Brompton Hospital" would be mapped to "Hospital A", "Guy's
Hospital" to "Hospital B", etc. Maybe the transcription software
would flag common real names or names already entered in the mapping
database. The software should also be able to select and blank the
corresponding section of the audio file (assuming here that to start with
there is an original file and a cleaned file).
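To make points 2 and 5 a bit more concrete, here is a rough Python sketch of
what I have in mind. Everything in it is an assumption for illustration: the
element and attribute names (Transcript, Speaker, Turn, gender, start, end)
are invented and are not Transcriber's actual DTD, the file name and the
pseudonym table are hypothetical, and blanking the audio is not attempted.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical transcript; element and attribute names are illustrative only.
SAMPLE = """\
<Transcript audio="interview01.wav">
  <Speakers>
    <Speaker id="spk1" name="Interviewer" role="interviewer"/>
    <Speaker id="spk2" name="Interviewee A" gender="female"/>
  </Speakers>
  <Turn speaker="spk1" start="0.0" end="4.2">Where were you treated?</Turn>
  <Turn speaker="spk2" start="4.2" end="9.8">At Royal Brompton Hospital, then Guy's Hospital.</Turn>
</Transcript>
"""

# Hypothetical project-wide pseudonym table (point 5).
PSEUDONYMS = {
    "Royal Brompton Hospital": "Hospital A",
    "Guy's Hospital": "Hospital B",
}


def turns_with_properties(root):
    """Copy each speaker's properties onto every turn by that speaker (point 2)."""
    speakers = {s.get("id"): dict(s.attrib) for s in root.iter("Speaker")}
    result = []
    for turn in root.iter("Turn"):
        # Start from the speaker's property sheet, then add the turn's own data.
        record = dict(speakers.get(turn.get("speaker"), {}))
        record.pop("id", None)
        record["start"] = float(turn.get("start"))
        record["end"] = float(turn.get("end"))
        record["text"] = turn.text or ""
        result.append(record)
    return result


def pseudonymize(text, mapping):
    """Replace every real name in text with its pseudonym."""
    if not mapping:
        return text
    # Try longer names first so a short name never clobbers part of a longer one.
    names = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(name) for name in names))
    return pattern.sub(lambda match: mapping[match.group(0)], text)


root = ET.fromstring(SAMPLE)
turns = turns_with_properties(root)
for turn in turns:
    turn["text"] = pseudonymize(turn["text"], PSEUDONYMS)
    print(turn.get("name"), turn.get("gender"), turn["start"], turn["text"])
```

The idea is that the gender property is declared once on the speaker and then
shows up on every one of her turns without any manual coding, and that the
same mapping table could be reused across all transcripts in a project.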
One other thought: I'm not an XML expert, but aren't DTDs old hat? Aren't
we really talking about XML Schemas? I quote from a recent W3C document
(see: http://www.w3.org/TR/xmlschema-1/): "XML Schema: Structures specifies
the XML Schema definition language, which offers facilities for describing
the structure and constraining the contents of XML 1.0 documents, including
those which exploit the XML Namespace facility. The schema language, which
is itself represented in XML 1.0 and uses namespaces, substantially
reconstructs and considerably extends the capabilities found in XML 1.0
document type definitions (DTDs)."
Alan.