Fisher English Training Part 2 Transcripts represents the second half of a collection of conversational telephone speech (CTS) that was created at the LDC during 2003. It consists of transcripts for the speech contained in Fisher English Training Part 2, Speech (LDC2005S13).
The Fisher telephone conversation collection protocol was created at LDC to address a critical need of developers trying to build robust automatic speech recognition (ASR) systems. Previous collection protocols, such as CALLFRIEND and Switchboard-II, and the resulting corpora have been adapted for ASR research but were in fact developed for language and speaker identification respectively. Although the CALLHOME protocol and corpora were developed to support ASR technology, they feature small numbers of speakers making telephone calls of relatively long duration with narrow vocabulary across the collection. CALLHOME conversations are challengingly natural and intimate. Under the Fisher protocol, a very large number of participants each make a few calls of short duration speaking to other participants, whom they typically do not know, about assigned topics. This maximizes inter-speaker variation and vocabulary breadth, although it also increases formality.
Previous protocols such as CALLHOME, CALLFRIEND and Switchboard relied upon participant activity to drive the collection. Fisher is unique in being platform driven rather than participant driven. Participants who wish to initiate a call may do so; however, the collection platform initiates the majority of calls. Participants need only answer their phones at the times they specified when registering for the study.
To encourage a broad range of vocabulary, Fisher participants are asked to speak on an assigned topic which is selected at random from a list, which changes every 24 hours and which is assigned to all subjects paired on that day. Some topics are inherited or refined from previous Switchboard studies while others were developed specifically for the Fisher protocol.
The first half of the collection (Fisher English Training Speech, Part 1) was released by the LDC in 2004 (LDC2004S13 for speech data, LDC2004T19 for transcripts). Taken as a whole, the two parts comprise 11,699 recorded telephone conversations.
The individual audio files are presented in NIST SPHERE format and contain two-channel mu-law sample data; shorten compression has been applied to all files.
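As an illustration of the file format mentioned above, the following is a minimal sketch of reading the plain-text header that begins a NIST SPHERE file. It assumes the usual SPHERE layout: a "NIST_1A" line, a line giving the header size in bytes, then "name -type value" fields terminated by "end_head". The function names and the example header values are hypothetical and were not taken from an actual Fisher file; the sketch reads only the header and does not undo shorten compression or decode mu-law samples.

```python
def parse_sphere_header(data: bytes) -> dict:
    """Parse the ASCII header at the start of a NIST SPHERE file."""
    if not data.startswith(b"NIST_1A\n"):
        raise ValueError("not a NIST SPHERE header")
    # The second 8-byte line holds the total header size, right-justified.
    header_size = int(data[8:16].decode("ascii").strip())
    text = data[16:header_size].decode("ascii", errors="replace")

    fields = {}
    for line in text.split("\n"):
        line = line.strip()
        if line == "end_head":
            break
        if not line or line.startswith(";"):
            continue  # skip blank lines and comment lines
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue  # skip anything that is not "name -type value"
        name, type_tag, value = parts
        if type_tag == "-i":
            fields[name] = int(value)
        elif type_tag == "-r":
            fields[name] = float(value)
        else:
            # "-sN" marks a string of N characters; keep it as text.
            fields[name] = value
    return fields


def build_example_header() -> bytes:
    """Build a hypothetical 1024-byte header for demonstration only."""
    lines = [
        "NIST_1A",
        "   1024",
        "channel_count -i 2",
        "sample_rate -i 8000",
        "sample_coding -s27 ulaw,embedded-shorten-v2.00",
        "end_head",
    ]
    raw = ("\n".join(lines) + "\n").encode("ascii")
    return raw.ljust(1024, b" ")  # pad to the declared header size


if __name__ == "__main__":
    for key, value in parse_sphere_header(build_example_header()).items():
        print(key, value)
```

With a real file, the first 1024 bytes read in binary mode would normally be enough to pass to parse_sphere_header, although the declared header size should be checked in case it is larger.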
Data collection and transcription were sponsored by DARPA and the U.S. Department of Defense, as part of the EARS project for research and development in automatic speech recognition.
Samples
To see an example of this corpus, please examine this sample.
Portions © 2003-2005 Trustees of the University of Pennsylvania