Introduction
This publication contains the Speech in Noisy Environments (SPINE) Evaluation
Transcripts, created for the Department of Defense (DoD) Digital Voice
Processing Consortium (DDVPC) by Arcon Corp. and produced by the Linguistic
Data Consortium (LDC) as catalog number LDC2000T54, ISBN 1-58563-189-2. A
companion corpus, Speech in Noisy Environments (SPINE) Evaluation Audio, was
also produced by the Linguistic Data Consortium (LDC); catalog number
LDC2000S96, ISBN 1-58563-188-4. These corpora support the 2000 Speech in Noisy
Environments evaluation.
The 2000 Speech in Noisy Environments Evaluation (SPINE1) is a first attempt
to assess the state of the art and practice in speech recognition technology in
noisy military environments and to exchange information on innovative speech
recognition technology in the context of fully implemented systems that perform
realistic tasks. It is intended to be of interest to all university, industrial
and commercial speech system developers working on the problem of robust speech
recognition. SPINE1 offers participants a flexible evaluation suited to their
development needs and abilities.
This work was sponsored in part by National Science Foundation Grant No. IIS-9982201.
Data
The SPINE1 evaluation focuses on the task of transcribing speech produced in
noisy environments, with an emphasis on noisy military
environments. The evaluation is designed to promote research progress in this
area, to provide the opportunity for participants to try out new ideas for
developing robust speech recognition systems that are of both scientific and
practical interest, and to measure the performance of this
technology. More information on this evaluation is available at SPINE1.
The evaluation task is to transcribe speech produced in noisy
environments. The training and test speech data to be used for this evaluation
were generated by ARCON Corp. for the DoD Digital Voice Processing Consortium
(DDVPC) under controlled conditions. The speech data consists of conversations
between two communicators working on a collaborative, Battleship-like task in
which they seek and shoot at targets (ARCON Communicability Exercise,
ACE). Participants may talk freely, but the total vocabulary used is fairly
limited. Each person is seated in a sound chamber in which a previously
recorded military background noise environment is accurately reproduced. The
participants use handsets and transmission channels that are resident to the
particular environment. The evaluation data includes 20
talker-pairs, with six five-minute conversations per talker-pair (about 600
minutes total), drawn from a set of four scenarios.
Updates
August 13, 2001: A tagging error was discovered and corrected in several files:
occurrences of the incorrect tag "[{noise}]" were converted to the correct tag,
"[/noise]". There were 433 occurrences of this error across all files. Also, in
the one case where "[noise/]" appeared twice on the same line, the second
instance was corrected to "[/noise]". If you previously
purchased this corpus and would like to download a corrected copy please
contact ldc@ldc.upenn.edu.
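If you are unsure whether your copy predates this update, a check along the
following lines may help. This is only a minimal sketch, not an LDC tool: the
function names are hypothetical, the sample line is made up for illustration,
and the code assumes the transcripts are plain text containing the literal tag
strings quoted above.

```python
# Hypothetical helpers; not part of the corpus distribution.
BAD_TAG = "[{noise}]"   # incorrect tag named in the update note
GOOD_TAG = "[/noise]"   # corrected tag named in the update note


def count_bad_tags(text: str) -> int:
    """Count literal occurrences of the pre-correction tag in transcript text."""
    return text.count(BAD_TAG)


def fix_bad_tags(text: str) -> str:
    """Return the text with the pre-correction tag replaced by the corrected tag."""
    return text.replace(BAD_TAG, GOOD_TAG)


if __name__ == "__main__":
    # Made-up sample line, for illustration only.
    sample = "[noise/] fire at delta four [{noise}]"
    print(count_bad_tags(sample))
    print(fix_bad_tags(sample))
```

A nonzero count suggests an uncorrected copy. The second correction in the note
(a repeated "[noise/]" on one line) is not covered by this sketch, so requesting
the corrected release from LDC remains the reliable route.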