On Transcription [ the Joys and Sorrows of ]
First, here is a quick overview of the professional webcred
transcripts, as initially received. (By the time you read this,
they are probably already cleaned up, so this is a moot point.
This is presented for transcription geeking value only.) More
detailed commentary and examples at the end.
[Now I really want to interview someone who runs a transcription
business… What kind of special effort do they make with celebrities’
names and quotes? What other VIP services are available? What kind of
insurance do they need, and what happens when you transcribe slander?
How hard is it to produce a perfect real-time transcript, and how close
to “real-time” can one get? My private hunch: as close as
needed to exactly real time, including ‘post’-processing.]
- The transcripts are beautiful. In contrast to some gov
and court transcripts I have seen, they are well laid out and easy to scan.
- The English is clean. Interruptions and stuttering are cleanly
handled, and most sessions have useful sentence and paragraph breaks
(even when the speaker was rambling).
- They get most names, proper nouns, organizations, and technology
references right. More than I would get without a list of names
in front of me.
- Their attention to small connecting words and under-the-breath
comments is generally excellent. In general, their accuracy is
fabulous [which is, as I’m sure you all know by now, the Official
Word of 2005], some 99.5%.
- This was done quickly: a ~1 week turnaround for 15 hours of dense audio.
- They are inconsistent. In some places, [inaudible] is used and
every speaker gets his/her own line and paragraph. In others, just a
few dashes or underscores “___” are used to indicate something
inaudible. Full names are used in some places, and not in others,
at times ambiguously. Some sessions have poor formatting and
paragraph breaks; the transcription of the podcasting audio clips,
for instance.
- They get many names, proper nouns, organizations, and technology
references wrong. They should have lists of these terms in front
of them, and should ask for what they don’t have. This would help
them spell podcasting without a space, write “blogosphere” without a
cap, and remember that yes, Jon Bonn