Introduction
The Translanguage English Database (TED) Transcripts project, Linguistic
Data Consortium (LDC) catalog number LDC2002T03 and ISBN 1-58563-202-3, is a
joint publication of the European Language Resources Association
(ELRA) and the LDC. Joint LDC/ELRA distribution of this work was sponsored in part by National Science Foundation Grant No. IIS-9982201.
Data
The 39 transcribed audio files are a subset of
the 188 speeches in the corresponding audio publication,
LDC2002S04, an LDC release of the ELRA TED corpus of recordings made at Eurospeech '93 in
Berlin. The TED audio recordings have non-native English speakers presenting
academic papers for approximately 15 minutes each. Included on the TED audio
publication are the papers, poster sessions, and original transcripts of oral
recordings for a subset of the presentations.
The 39 transcripts in this publication are in Universal
Transcription Format (UTF) and were prepared by the LDC. All UTF files in the
transcript publication were validated against the utf.dtd. Tables containing
speaker demographic information and a cross-reference of file names from the TED
audio corpus are included. A sample of one of the transcripts is also
available.
Please note that poster presentations, notes and questionnaires are not
available for every author in the corresponding TED audio publication LDC2002S04.
Updates
There are no updates at this time.
Content Copyright
Portions © 1993-2002 University of Munich, Germany; LIMSI-CNRS, France;
ELRA; and the Trustees of the University of Pennsylvania