TI 46-Word
| |
| Item Name: | TI 46-Word |
| Authors: | Mark Liberman, Robert Amsler, Ken Church, Ed Fox, Carole Hafner, Judy Klavans, Mitch Marcus, Bob Mercer, Jan Pedersen, Paul Roossin, Don Walker, Susan Warwick and Antonio Zampolli |
| LDC Catalog No.: | LDC93S9 |
| NIST Catalog No.: | 7-1.1 |
| ISBN: | 1-58563-017-9 |
| Data Type: | speech |
| Sample Rate: | 12500 Hz |
| Sampling Format: | 1-channel 12-bit pcm |
| Data Source(s): | microphone speech |
| Application(s): | speech recognition |
| Language(s): | English |
| Language ID(s): | eng |
| Distribution: | 1 CD |
| Member fee: | $0 for 1993 members |
| Non-member Fee: | US $300.00 |
| Reduced-License Fee: | US $150.00 |
| Extra-Copy Fee: | US $150.00 |
| Non-member License: | yes |
| Readme File: | yes |
| Online documentation: | yes |
| Licensing Instructions: | Subscription Members, Standard Members, Non-Members |
| Citation: | Mark Liberman, et al. 1993. TI 46-Word. Linguistic Data Consortium, Philadelphia. |
|
| This CD-ROM contains a corpus of speech which was originally designed
and collected at Texas Instruments, Inc. (TI) in 1980 and used
initially in performance assessment tests of isolated-word
speaker-dependent technology. (See "Speech Recognition: Turning Theory
to Practice" by G. R. Doddington and T. B. Schalk, in IEEE Spectrum,
Vol. 18, No. 9, September 1981.)
The 46-word vocabulary consists of two sub-vocabularies: (1) the TI
20-word vocabulary (consisting of the digits zero through nine plus
the words "enter," "erase," "go," "help," "no," "rubout," "repeat,"
"stop," "start," and "yes") and (2) the TI 26-word "alphabet set"
(consisting of the letters "a" through "z").
The corpus contains read utterances from 16 speakers (eight male and eight
female), each speaking 26 utterances of every word in the 46-word
vocabulary: ten tokens designated as training and 16 as test.
The corpus was collected at Texas Instruments in a quiet acoustic
enclosure using an Electro-Voice RE-16 Dynamic Cardioid microphone at
a 12.5 kHz sample rate with 12-bit quantization. The files are in NIST
SPHERE format and have a ".wav" filename extension. |