The text component of the CALLHOME German
corpus package includes transcripts and documentation files. The
transcripts cover contiguous five- or ten-minute segments taken from 100 unscripted
telephone conversations between native speakers of German. The transcripts are
timestamped by speaker turn for alignment with the speech signal and are
provided in standard orthography.
In addition to transcript files, this corpus contains full documentation on
the transcription conventions and format. Complete auditing information on the
speakers represented in the transcripts (including gender, channel quality and
so on) is also included.
This corpus is distributed through the LDC's FTP server.
The corresponding corpus of telephone speech (LDC97S43) and an
associated lexicon (LDC97L18) are available separately.
For a list of updates, user reports, and other addenda, please go to LDC1997T15.
There are no updates at this time.