1998 HUB4 Broadcast News Evaluation English Test Material

Item Name: 1998 HUB4 Broadcast News Evaluation English Test Material
Authors: .
LDC Catalog No.: LDC2000S86
ISBN: 1-58563-172-8
Data Type: speech
Data Source(s): broadcast news
Project(s): Hub4
Application(s): speech recognition
Language(s): English
Language ID(s): eng
Distribution: 1 CD
Member fee: $0 for 2000 members
Non-member Fee: N/A (Members Only)
Reduced-License Fee: N/A
Extra-Copy Fee: US$150.00
Member License: yes
Online documentation: yes
Licensing Instructions: Subscription Members, Standard Members, Non-Members
Citation: 2000. 1998 HUB4 Broadcast News Evaluation English Test Material. Linguistic Data Consortium, Philadelphia.

Introduction

This publication contains the evaluation test material used in the 1998 DARPA/NIST Continuous Speech Recognition Broadcast News HUB4 English Benchmark Test, which was administered by the NIST Spoken Natural Language Processing Group. It was produced by the Linguistic Data Consortium (LDC) and carries catalog number LDC2000S86, ISBN 1-58563-172-8.

Data

The test material is contained in two SPHERE-formatted waveform files. The file h4e_98_1.sph (set1) contains 1.5 hours of Broadcast News excerpts from 1996. The file h4e_98_2.sph (set2) contains 1.5 hours of Broadcast News excerpts from 1998. Each file should be separately recognized per the HUB4 English Evaluation Specification.
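Because both files are in SPHERE format, it can help to inspect a file's header before running a recognizer on it. The following is a minimal sketch, not part of this publication: it assumes the standard NIST SPHERE layout (an ASCII header that begins with the line NIST_1A, followed by the header size in bytes, then "name -type value" lines terminated by end_head). The function name read_sphere_header is hypothetical.

```python
def read_sphere_header(stream):
    """Parse the ASCII header of a NIST SPHERE file into a dict.

    `stream` must be a seekable binary file object.
    Assumes the usual layout: "NIST_1A", header size, fields, "end_head".
    """
    magic = stream.readline().strip()
    if magic != b"NIST_1A":
        raise ValueError("not a SPHERE file: missing NIST_1A marker")

    # Second line holds the total header size in bytes (commonly 1024).
    header_size = int(stream.readline().strip())
    stream.seek(0)
    header = stream.read(header_size).decode("ascii", errors="replace")

    fields = {}
    for line in header.splitlines()[2:]:
        line = line.strip()
        if line == "end_head":
            break
        if not line or line.startswith(";"):
            continue  # skip blank lines and comments
        name, type_code, value = line.split(None, 2)
        if type_code == "-i":
            fields[name] = int(value)
        elif type_code == "-r":
            fields[name] = float(value)
        else:
            # String fields look like "-sN", where N is the string length.
            length = int(type_code[2:])
            fields[name] = value[:length]
    return fields
```

A call such as `read_sphere_header(open("h4e_98_1.sph", "rb"))` would then return the header fields as a dictionary. Which fields are present, and their values, depend on the file itself; none are stated here.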

In addition, the transcripts from the evaluation and the utf.dtd used to validate them are now included. For more complete information, see the 1998 HUB4 Website.

Note: This publication does not contain the Human Reference and Baseline Recognizer transcripts for the Information Extraction - Named Entity (IE-NE) Spoke. This material was released separately prior to the start of the IE-NE Spoke.

Note: This publication does not contain the material for the HUB4 Non-English evaluation. It will be released separately.

Updates

There are no updates at this time.

Copyright

Portions Copyright 1996 by PRI-Public Radio International
Portions Copyright 1996 by ABC News
Portions Copyright 1996 Cable News Network, Inc. All Rights Reserved.
Restricted Rights Legend: Information from the USC program 'Marketplace' contained herein is the property of USC Radio and the University of Southern California and is protected by copyright. Use, duplication or disclosure by you is subject to the restrictions set forth in the user agreement and attached to the computer readable media provided to you by the Linguistic Data Consortium of the University of Pennsylvania. Copyright 1996 University of Southern California. All Rights Reserved. Marketplace is produced by USC Radio at the University of Southern California, and is distributed to public radio stations nationwide by PRI-Public Radio International. Marketplace is made possible by GE, the Corporation for Public Radio, and Public Radio Stations nationwide.

Note that the waveform and transcript data on this disc are licensed through the Linguistic Data Consortium (LDC) and are subject to usage restrictions. Contact the LDC for license agreement information.

Pricing

The Reduced Licensing Fee for this corpus is US$150.


Contact: ldc@ldc.upenn.edu

(c) 1992-2008 Linguistic Data Consortium, University of Pennsylvania. All Rights Reserved.