Introduction
2005 Spring NIST Rich Transcription (RT-05S) Conference Meeting Evaluation Set was developed by LDC
and NIST (National Institute of Standards and Technology). It contains approximately
78 hours of English meeting speech, reference transcripts and other material
used in the RT Spring 2005 evaluation. Rich Transcription (RT) is broadly defined as a
fusion of speech-to-text (STT) technology and metadata extraction technologies
providing the bases for the generation of more usable transcriptions of human-human
speech in meetings. LDC has also released 2004 Spring NIST Rich Transcription (RT-04S)
Development Data (LDC2007S11) and 2004 Spring NIST Rich Transcription (RT-04S)
Evaluation Data (LDC2007S12).
RT-05S included the following tasks in the meeting domain:
- Speech-To-Text (STT) - convert spoken words into streams of text
- Speaker Diarization (SPKR) - find the segments of time within a meeting in which each meeting participant is talking
- Speech Activity Detection (SAD) - detect when someone in a meeting space is talking
Further information about the evaluation is available on the RT-05Spring Evaluation
Website.
Data Description
The data in this release consists of portions of meeting speech collected between
2001 and 2005 by the IDIAP Research Institute's Augmented Multi-Party Interaction
project (AMI), Martigny, Switzerland; the International Computer Science Institute
(ICSI) at the University of California, Berkeley; Interactive Systems Laboratories
(ISL) at Carnegie Mellon University (CMU), Pittsburgh, PA; NIST; and
Virginia Polytechnic Institute and State University (VT), Blacksburg, VA. Each meeting excerpt
contains a head-mic recording for each subject and one or more distant microphone
recordings.
Reference transcripts for the evaluation excerpts were prepared by LDC
according to its Meeting Recording Careful Transcription Guidelines. Those
specifications are designed to provide an accurate, verbatim (word-for-word)
transcription, time-aligned with the audio file and including the identification
of additional audio and speech signals with special mark-up.
Samples
For an example of the data contained in this corpus, review this audio sample.
Content Copyright
Portions © 2011 Trustees of the University of Pennsylvania