Introduction
This publication contains the Speech in Noisy Environments (SPINE) Training
Audio Corpus, created for the Department of Defense (DoD) Digital Voice
Processing Consortium (DDVPC) by ARCON Corp. and produced by the Linguistic
Data Consortium (LDC) as catalog number LDC2000S87, ISBN 1-58563-173-6. A
companion corpus, Speech in Noisy Environments (SPINE) Training Transcripts,
was also produced by the LDC as catalog number LDC2000T49, ISBN
1-58563-174-4. These corpora support the 2000 Speech in
Noisy Environments (SPINE1) evaluation. For an example of a corresponding
transcript from the Speech in Noisy Environments (SPINE) Training Transcripts
Corpus, please click here. Due to size and format
considerations, no example of a speech file is provided.
The 2000 Speech in Noisy Environments Evaluation (SPINE1) is a first attempt
to assess the state of the art and practice in speech recognition technology in
noisy military environments and to exchange information on innovative speech
recognition technology in the context of fully implemented systems that perform
realistic tasks. It is intended to be of interest to all university, industrial
and commercial speech system developers working on the problem of robust speech
recognition. The evaluation is flexible, so participants can take part in a
way suited to their development needs and abilities.
The SPINE1 evaluation focuses on the task of transcribing speech produced in
noisy environments with emphasis on noisy military environments. The evaluation
is designed to promote research progress in this area, to provide the
opportunity for participants to try out new ideas for developing robust speech
recognition systems that are of both scientific and practical interest, and to
measure the performance of this technology. More information on this
evaluation is available at SPINE1.
This work was sponsored in part by National Science Foundation Grant No. IIS-9982201.
Data
The evaluation task is to transcribe speech produced in noisy
environments. The training and test speech data to be used for this evaluation
were generated by ARCON Corp. for the DoD Digital Voice Processing Consortium
(DDVPC) under controlled conditions. The speech data consists of conversations
between two communicators working on a collaborative, Battleship-like task in
which they seek and shoot at targets (ARCON Communicability Exercise,
ACE). Participants may talk freely, but the total vocabulary used is fairly
limited. Each person is seated in a sound chamber in which a previously
recorded military background noise environment is accurately reproduced. The
participants use handsets and transmission channels that are resident to the
particular environment. The training data includes 10 of the 20 available
talker pairs, with 14 five-minute conversations per talker pair (about 720
minutes total), covering four noise scenarios.
Samples
For an example of the data contained in this corpus, please listen to this audio sample.
Updates
There are no updates at this time.