Introduction
This publication contains the English evaluation test material used in
the 1999 NIST Broadcast News Transcription Evaluation, administered by the
NIST Spoken Natural Language Processing Group and produced by the Linguistic
Data Consortium (LDC). Catalog number LDC2000S88, ISBN 1-58563-176-0.
Data
The test material is contained in two SPHERE-formatted waveform
files. The file bn99en_1.sph (set1) contains 1.5 hours of Broadcast News
excerpts from last year's set2 epoch. The file bn99en_2.sph (set2)
contains 1.5 hours of Broadcast News excerpts from the summer of 1998. Each
file should be separately recognized per the Broadcast News English Evaluation
Specification.
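As an illustration of what "SPHERE-formatted" implies for anyone reading these files directly, the sketch below parses the plain-text header that precedes the audio samples. It assumes the usual NIST SPHERE layout: a first line "NIST_1A", a second line giving the header size in bytes, then "name -type value" lines terminated by "end_head". The function name read_sphere_header and the sample field values are hypothetical and are not taken from this release.

```python
import io


def read_sphere_header(stream):
    """Parse a NIST SPHERE header from a binary stream.

    Returns (header_size_in_bytes, dict_of_fields). Assumes the common
    layout: 'NIST_1A', header size, 'name -type value' lines, 'end_head'.
    """
    magic = stream.readline().decode("ascii", errors="replace").strip()
    if magic != "NIST_1A":
        raise ValueError("not a SPHERE file: missing NIST_1A marker")
    header_size = int(stream.readline().decode("ascii").strip())

    # Re-read the whole header block now that its size is known.
    stream.seek(0)
    raw = stream.read(header_size).decode("ascii", errors="replace")

    fields = {}
    for line in raw.splitlines()[2:]:
        line = line.strip()
        if line == "end_head":
            break
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        name, type_flag, value = parts
        if type_flag == "-i":
            fields[name] = int(value)
        elif type_flag == "-r":
            fields[name] = float(value)
        else:  # "-sN": string of N characters
            fields[name] = value
    return header_size, fields


# Synthetic header for demonstration only (values are assumptions).
demo = (b"NIST_1A\n   1024\n"
        b"sample_rate -i 16000\n"
        b"channel_count -i 1\n"
        b"sample_coding -s3 pcm\n"
        b"end_head\n").ljust(1024, b" ")
demo_size, demo_fields = read_sphere_header(io.BytesIO(demo))
```

With a real file, the same function could be called on open("bn99en_1.sph", "rb"); the audio samples begin at the returned header size offset.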
Additional test material for each set is also included: evaluation
map files (bn99en_1.uem), automatically generated
segmentation files (bn99en_1.seg), transcripts from the
evaluation (bn99en_1.utf) together with the utf.dtd used to validate them,
reference STM files (bn99en_1.stm), and transcript orthography
mapping files (en981118.glm). For more complete
information, see the 1998 HUB4 Website.
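For readers who want to inspect the reference STM files programmatically, here is a minimal sketch of a line parser. It assumes the segment time-mark layout used by NIST scoring tools: waveform name, channel, speaker, start time, end time, an optional label in angle brackets, then the transcript, with comment lines starting with ";;". The names StmSegment and parse_stm_line, and the sample line, are hypothetical and are not taken from this corpus.

```python
from typing import NamedTuple, Optional


class StmSegment(NamedTuple):
    waveform: str
    channel: str
    speaker: str
    start: float
    end: float
    label: Optional[str]
    transcript: str


def parse_stm_line(line):
    """Parse one STM line; return None for blank lines and ';;' comments."""
    line = line.strip()
    if not line or line.startswith(";;"):
        return None
    parts = line.split(None, 5)
    if len(parts) < 5:
        raise ValueError("STM line has fewer than five fields: %r" % line)
    waveform, channel, speaker, start, end = parts[:5]
    rest = parts[5] if len(parts) > 5 else ""

    # An optional <label> may precede the transcript text.
    label = None
    if rest.startswith("<"):
        close = rest.find(">")
        if close != -1:
            label = rest[:close + 1]
            rest = rest[close + 1:].strip()
    return StmSegment(waveform, channel, speaker,
                      float(start), float(end), label, rest)


# Hypothetical sample line, for illustration only.
sample = "bn99en_1 1 spkr_a 10.50 14.25 <o,f0,male> good evening"
segment = parse_stm_line(sample)
```

Times are kept as floating-point seconds, so a segment duration is simply segment.end - segment.start.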
Updates
There are no updates at this time.
Copyright
Portions Copyright 1998 PRI-Public Radio International
Portions Copyright 1997-1998 ABC News
Portions Copyright 1998 NBC News
Portions Copyright 1997-1998 Cable News Network, Inc. All Rights Reserved
Note that the waveform and transcript data on this disc are licensed
through the Linguistic Data Consortium (LDC) and are
subject to usage restrictions. Contact the LDC for
license agreement information.
Pricing
The Reduced Licensing Fee for this corpus is US$150.