This file contains documentation for CSLU: Alphadigit
Version 1.3, Linguistic Data Consortium (LDC) catalog number LDC2008S06
and ISBN 1-58563-478-6.
Alphadigit Version 1.3 is a collection of 78,044 utterances from 3,025 speakers
saying six-character strings of letters and digits over the telephone for a total
of approximately 82 hours of speech. Each speech file has corresponding orthographic
and phonemic transcriptions. This corpus was created by the Center for Spoken
Language Understanding (CSLU), Oregon Health & Science University, Beaverton,
Oregon.
Speakers were recruited using Usenet postings. Respondents registered for the
collection by completing an online form. Once registered, they received a list
of 18-29 six-character strings (e.g., "a 2 b 4 5 g") and participation
instructions. Speakers called the CSLU data collection system by dialing a toll-free
number and were prompted for each string; 1,102 different strings were used throughout
the course of the data collection. The lists were set up to balance for phonetic
context between all letter and digit pairs.
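As a rough consistency check on the figures quoted above, the following Python sketch derives the average number of utterances per speaker and the average utterance duration. It uses only numbers stated in this document (78,044 utterances, 3,025 speakers, approximately 82 hours). The variable names are illustrative, and because the 82-hour total is approximate, the results are estimates.

```python
# Figures quoted in this document; the 82-hour total is approximate.
UTTERANCES = 78_044
SPEAKERS = 3_025
TOTAL_HOURS = 82

# Average number of recorded strings per speaker.
utterances_per_speaker = UTTERANCES / SPEAKERS

# Average duration of one utterance, in seconds.
avg_utterance_seconds = TOTAL_HOURS * 3600 / UTTERANCES

print(f"utterances per speaker: {utterances_per_speaker:.1f}")
print(f"average utterance length: {avg_utterance_seconds:.2f} s")
```

The per-speaker average of roughly 26 falls inside the 18-29 range of list lengths, and each utterance averages a little under four seconds.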
The data were recorded directly from a digital phone line without digital-to-analog
or analog-to-digital conversion at the recording end using the CSLU T1 digital
data collection system. The sampling rate was 8 kHz and the files were stored
in 8-bit mu-law format on a UNIX file system. The files have been converted
to RIFF standard file format, 16-bit linearly encoded.
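To illustrate the kind of conversion described above, here is a minimal Python sketch that expands 8-bit mu-law codes to 16-bit linear samples and writes them as a RIFF/WAVE stream. It assumes standard G.711 mu-law encoding and mono audio at 8 kHz; the function names are hypothetical, and this is not the tool CSLU actually used to convert the corpus.

```python
import io
import struct
import wave

BIAS = 0x84  # G.711 mu-law bias (132)

def mulaw_to_linear(byte):
    """Decode one 8-bit G.711 mu-law code to a signed 16-bit linear sample."""
    u = ~byte & 0xFF  # mu-law codes are stored bit-inverted
    # Rebuild the magnitude from the 4-bit mantissa and 3-bit exponent.
    magnitude = (((u & 0x0F) << 3) + BIAS) << ((u & 0x70) >> 4)
    # The top bit of the inverted code carries the sign.
    return BIAS - magnitude if u & 0x80 else magnitude - BIAS

def mulaw_bytes_to_wav(mulaw_data, out, sample_rate=8000):
    """Write mu-law bytes to `out` as mono 16-bit linear PCM in RIFF/WAVE form."""
    samples = [mulaw_to_linear(b) for b in mulaw_data]
    with wave.open(out, "wb") as w:
        w.setnchannels(1)            # assumption: single-channel telephone audio
        w.setsampwidth(2)            # 16-bit linear samples
        w.setframerate(sample_rate)  # 8 kHz, as stated in this document
        w.writeframes(struct.pack("<%dh" % len(samples), *samples))

# Usage: decode three mu-law codes into an in-memory WAV stream.
buf = io.BytesIO()
mulaw_bytes_to_wav(bytes([0xFF, 0x80, 0x00]), buf)
```

Because each 8-bit code expands to a 16-bit sample, the converted files are about twice the size of the original mu-law data.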
All of the files included in this corpus have corresponding non-time-aligned
word-level transcriptions and time-aligned phoneme-level transcriptions (automatic
forced alignment) that comply with the conventions in the CSLU Labeling Guide.
Non-time-aligned orthographic transcriptions provide quick access to the content
of an utterance; they may contain markers for word boundaries to support access
and retrieval at the lexical level. Phonetic/phonemic transcriptions represent
the phonetic content of an utterance at a given level of detail that is made
explicit by the use of diacritics. Phonetic phenomena transcribed include excessive
nasalization, glottalization, frication on a stop, centralization, lateralization,
rounding and palatalization.
Portions © 2000-2002 Center for Spoken Language Understanding, Oregon
Health & Science University, © 2008 Trustees of the University of Pennsylvania