Obtaining DataUsing DataProviding DataCreating Data
About LDCMembersCatalogProjectsPapersLDC OnlineSearchContact UsUPennHome

LDC Catalog | By Type and Source | By Year | Top Ten | Projects | Catalog Search



TIDIGITS

Item Name: TIDIGITS
Authors: R. Gary Leonard and George Doddington
LDC Catalog No.: LDC93S10
ISBN: 1-58563-018-7
Data Type: speech
Sample Rate: 20000 Hz
Sampling Format: 1-channel pcm compressed
Data Source(s): microphone speech
Application(s): speech recognition
Language(s): English
Language ID(s): eng
Distribution: 1 DVD
Member fee: $0 for 1993 members
Non-member Fee: US $500.00
Reduced-License Fee: US $250.00
Extra-Copy Fee: US $200.00
Non-member License: yes
Online documentation: yes
Licensing Instructions: Subscription Members, Standard Members, Non-Members
Citation: R. Gary Leonard and George Doddington
1993
TIDIGITS
Linguistic Data Consortium, Philadelphia

This two-disc set contains speech which was originally designed and collected at Texas Instruments, Inc. (TI) for the purpose of designing and evaluating algorithms for speaker-independent recognition of connected digit sequences. There are 326 speakers (111 men, 114 women, 50 boys and 51 girls) each pronouncing 77 digit sequences. Each speaker group is partitioned into test and training subsets.

The corpus was collected at TI in 1982 in a quiet acoustic enclosure using an Electro-Voice RE-16 Dynamic Cardiod microphone, digitized at 20kHz. The waveform files are in the NIST SPHERE format.

Content Copyright

Portions © 1993 Trustees of the University of Pennsylvania


About LDC | Members | Catalog | Projects | Papers | LDC Online | Search / Help | Contact Us | UPenn | Home | Obtaining Data | Creating Data | Using Data | Providing Data

Contact: ldc@ldc.upenn.edu

(c) 1992-2010 Linguistic Data Consortium, University of Pennsylvania. All Rights Reserved.