



Speech Controlled Computing

Item Name: Speech Controlled Computing
Authors: Christopher Cieri, David Miller, Nii O. Martey, and Kazuaki Maeda
LDC Catalog No.: LDC2006S30
ISBN: 1-58563-380-1
Release Date: Mar 24, 2006
Data Type: speech
Sample Rate: 48000 Hz
Sampling Format: pcm
Data Source(s): microphone speech
Application(s): machine learning
Language(s): English
Language ID(s): eng
Distribution: 5 DVD
Member Fee: $0 for 2006 members
Non-member Fee: US $7000.00
Reduced-License Fee: US $3500.00
Extra-Copy Fee: US $1000.00
Non-member License: yes
Member License: yes
Online documentation: yes
Licensing Instructions: Subscription Members, Standard Members, Non-Members
Citation: Christopher Cieri, et al. 2006. Speech Controlled Computing. Linguistic Data Consortium, Philadelphia.

Introduction

This file contains documentation on Speech Controlled Computing, Linguistic Data Consortium (LDC) catalog number LDC2006S30 and ISBN 1-58563-380-1.

The Speech Controlled Computing corpus was designed to support the development of small-footprint, embedded ASR applications in the domain of voice control for the home. It consists of recordings of 125 speakers of American English from four dialect regions, three age groups and two gender groups, pronouncing isolated words. The four primary dialect regions covered by the corpus are North, South, West and Midland, as defined by William Labov's Atlas of North American English. The three primary age groups covered by the corpus are 18-29, 30-49 and 50+.

The recordings were conducted in a sound-attenuated room at LDC with the AKG C4000B studio condenser microphone. The omni-directional mode of the C4000B was used. Each speaker read a randomized word list consisting of 2,100 words (100 distinct words appearing 21 times each). Speech utterances were digitized and recorded to a DAT, as well as to a hard disk drive via the Townshend DATLINK+ digital audio interface.
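The word-list arithmetic above (100 distinct words, 21 repetitions each, 2,100 prompts per speaker) can be illustrated with a short Python sketch. The placeholder vocabulary, the function name build_prompt_list and the seed handling are assumptions made for illustration only; this documentation does not describe how LDC actually randomized the lists.

```python
import random
from collections import Counter


def build_prompt_list(vocabulary, repetitions=21, seed=None):
    """Return a shuffled list in which every word appears `repetitions` times."""
    rng = random.Random(seed)  # a seeded generator keeps a speaker's list reproducible
    prompts = [word for word in vocabulary for _ in range(repetitions)]
    rng.shuffle(prompts)
    return prompts


# Placeholder vocabulary: the real 100-word list is not reproduced here.
vocabulary = [f"word{i:03d}" for i in range(100)]

prompts = build_prompt_list(vocabulary, repetitions=21, seed=0)
counts = Counter(prompts)
print(len(prompts), len(counts))
```

Using a separate random.Random instance rather than the module-level functions is a design choice that would let each speaker receive a different but repeatable ordering.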

Speech utterances were audited as they were recorded, and any utterances that the recorder judged not to have been spoken clearly or correctly were re-recorded. This included utterances corrupted by extraneous clicks, coughs, sighs or breathing. Utterances that were spoken too softly or too loudly were also re-recorded.

The digitized utterances were automatically segmented and aligned to the word list. Then each utterance was audited and the segmentation was checked, and corrected if necessary, by an annotator using an auditing and segmenting tool developed by LDC.
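The documentation does not say which algorithm performed the automatic segmentation. As a rough illustration only, the Python sketch below uses a simple frame-energy threshold, which is one common approach and not necessarily the one LDC used. The function names, the 10 ms frame length (480 samples at 48000 Hz), the threshold and the minimum gap are all hypothetical values chosen for the example.

```python
def segment_by_energy(samples, frame_len=480, threshold=500.0, min_gap_frames=10):
    """Return (start, end) sample indices of regions louder than `threshold`.

    A region ends once `min_gap_frames` consecutive quiet frames follow it.
    """
    segments = []
    start = None   # sample index where the current region began
    silent = 0     # quiet frames seen since the last loud frame
    n_frames = len(samples) // frame_len
    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        level = sum(abs(s) for s in frame) / frame_len  # mean absolute amplitude
        if level > threshold:
            if start is None:
                start = i * frame_len
            silent = 0
        elif start is not None:
            silent += 1
            if silent >= min_gap_frames:
                # The region ends right after the last loud frame.
                segments.append((start, (i - silent + 1) * frame_len))
                start = None
                silent = 0
    if start is not None:
        # The recording ended before a long enough gap was seen.
        segments.append((start, (n_frames - silent) * frame_len))
    return segments


def align_to_word_list(segments, words):
    """Pair each detected segment with the prompt at the same position."""
    if len(segments) != len(words):
        raise ValueError("segment count does not match word list; manual check needed")
    return list(zip(words, segments))


# Synthetic signal: silence, a loud burst, silence, a shorter burst, silence.
signal = [0] * 4800 + [1000] * 4800 + [0] * 9600 + [1000] * 2400 + [0] * 4800
segments = segment_by_energy(signal)
aligned = align_to_word_list(segments, ["lights", "door"])
```

A count mismatch raises an error instead of guessing, which mirrors the manual auditing and correction step described above.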

Finally, sound files containing individual utterances were generated using the alignment and segmentation information. The sound files for this corpus were created with 100 msec of silent time before and after each utterance. Any files that contained noticeable clipping were automatically removed.
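At the 48000 Hz sample rate listed for this corpus, 100 msec corresponds to 4,800 samples. The Python sketch below shows that calculation, a padded cut and a basic clipping check. It is illustrative only: the function names, the 16-bit full-scale value, the rule of three consecutive full-scale samples and the choice to take the padding from the surrounding recording are assumptions, not details given in this documentation.

```python
SAMPLE_RATE_HZ = 48000  # sample rate listed in the catalog entry
PAD_MS = 100            # silence kept before and after each utterance


def pad_samples(ms, sample_rate=SAMPLE_RATE_HZ):
    """Convert a duration in milliseconds to a whole number of samples."""
    return sample_rate * ms // 1000


def cut_utterance(recording, start, end, pad_ms=PAD_MS, sample_rate=SAMPLE_RATE_HZ):
    """Slice recording[start:end] plus padding on both sides, clamped to the recording."""
    pad = pad_samples(pad_ms, sample_rate)
    lo = max(0, start - pad)
    hi = min(len(recording), end + pad)
    return recording[lo:hi]


def is_clipped(samples, full_scale=32767, min_run=3):
    """Return True if `min_run` consecutive samples sit at or beyond full scale.

    full_scale assumes 16-bit PCM, which is an assumption; the bit depth is not stated.
    """
    run = 0
    for s in samples:
        if abs(s) >= full_scale:
            run += 1
            if run >= min_run:
                return True
        else:
            run = 0
    return False


recording = [0] * 20000
utterance = cut_utterance(recording, 6000, 10000)
print(pad_samples(PAD_MS), len(utterance))
```

Requiring a short run of full-scale samples, rather than a single one, is a design choice that avoids flagging a file because of one isolated peak.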

Samples

For an example of this corpus, please listen to this audio sample.

Content Copyright

© 2003-2006 Trustees of the University of Pennsylvania



Contact: ldc@ldc.upenn.edu

(c) 1992-2010 Linguistic Data Consortium, University of Pennsylvania. All Rights Reserved.