Introduction
This publication contains the Speech in Noisy Environments (SPINE) Training
Transcripts, created for the Department of Defense (DoD) Digital Voice
Processing Consortium (DDVPC) by ARCON Corp. and produced by the Linguistic
Data Consortium (LDC) as catalog number LDC2000T49, ISBN 1-58563-174-4. A
companion corpus, Speech in Noisy Environments (SPINE) Training Audio, was
also produced by the LDC as catalog number LDC2000S87, ISBN 1-58563-173-6.
These corpora support the 2000 Speech in Noisy
Environments evaluation. An example transcript is also provided.
The 2000 Speech in Noisy Environments Evaluation (SPINE1) is a first attempt
to assess the state of the art and practice in speech recognition technology in
noisy military environments and to exchange information on innovative speech
recognition technology in the context of fully implemented systems that perform
realistic tasks. It is intended to be of interest to all university, industrial
and commercial speech system developers working on the problem of robust speech
recognition. The evaluation is flexible, so participants can take part in a
way suited to their development needs and abilities.
The SPINE1 evaluation focuses on the task of transcribing speech produced in
noisy environments, with an emphasis on speech produced in noisy military
environments. The evaluation is designed to promote research progress in this
area, to provide the opportunity for participants to try out new ideas for
developing robust speech recognition systems that are of both scientific and
practical interest, and to measure the performance of this technology. More
information on this evaluation is available at SPINE1.
This work was sponsored in part by National Science Foundation Grant No. IIS-9982201.
Data
The evaluation task is to transcribe speech produced in noisy
environments. The training and test speech data to be used for this evaluation
were generated by ARCON Corp. for the DoD Digital Voice Processing Consortium
(DDVPC) under controlled conditions. The speech data consists of conversations
between two communicators working on a collaborative, Battleship-like task in
which they seek and shoot at targets (ARCON Communicability Exercise,
ACE). Participants may talk freely, but the total vocabulary used is fairly
limited. Each person is seated in a sound chamber in which a previously
recorded military background noise environment is accurately reproduced. The
participants use handsets and transmission channels that are resident to the
particular environment. The training data includes 10 of the 20 available
talker pairs, with 14 five-minute conversations per talker pair (about 720
minutes total), covering four noise scenarios.
Updates
There are no updates at this time.