Introduction
Switchboard Cellular Part 2 Audio was produced by the Linguistic Data
Consortium (LDC), catalog number LDC2004S07, ISBN 1-58563-297-x.
The Switchboard Cellular Part 2 Audio collection focused primarily on cellular
phone technology (all service types). Collection began on 09/16/2000
and was completed by 12/15/2000. The project's goal was to recruit
200 subjects, balanced by gender, to participate in ten or more
conversations of five to six minutes each on cellular phones, under varied
environmental conditions. The speech data were collected for research,
development, and evaluation of automatic systems for speech-to-text
conversion, talker identification, language identification, and speech
signal detection.
Data
This release contains speech data files only, along with documentation describing
speaker information (sex, age, education, city and state where raised),
call information (date, time, call duration, Personal Identification
Numbers, topic), and audit information (channel quality, background
noise). The documentation also contains reports on clipped files.
During the collection period, the LDC collected 2,020 calls, or
4,040 sides (2,950 cellular; 2,405 female, 1,635 male), from 419 participants,
under varied environmental conditions. There are 2,020 speech files in
total, amounting to roughly 202 hours of audio data (a little over 11 GB).
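As a rough consistency check on these figures, the sketch below recomputes the side count, the average call length, and the approximate corpus size. The 8 kHz sampling rate and one byte per mu-law sample are assumptions (typical for telephone speech) and are not stated in this document; the header size and channel count are taken from the file format description.

```python
# Figures quoted in the text.
calls = 2020
female_sides, male_sides = 2405, 1635
total_hours = 202

# From the file format description: 1,024-byte header, two channels per file.
HEADER_BYTES = 1024
CHANNELS = 2

# Assumptions, NOT stated in the text: 8 kHz sampling, 1 byte per mu-law sample.
SAMPLE_RATE_HZ = 8000
BYTES_PER_SAMPLE = 1

sides = 2 * calls                          # one side per speaker per call
avg_minutes = total_hours * 60 / calls     # average call length in minutes
seconds_per_call = int(avg_minutes * 60)
bytes_per_call = (HEADER_BYTES
                  + CHANNELS * SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * seconds_per_call)
total_bytes = calls * bytes_per_call

print("sides:", sides, "female + male:", female_sides + male_sides)
print("average call length (min):", avg_minutes)
print("estimated size (GB, decimal):", round(total_bytes / 1e9, 2))
```

Under these assumptions the average call is six minutes long and the estimate comes to about 11.6 GB, in line with "a little over 11 GB".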
Each speech file consists of a 1,024-byte ASCII-formatted Sphere header,
followed by two-channel interleaved mu-law sample data. The mu-law samples
represent the actual digital data transmission from the telephone service
provider (MCI), as captured separately for each side of the telephone
conversation by the LDC's telephone collection platform. The header also
indicates the caller_pin, callee_pin, topic_id, cellular service/handset
information and speaker demographic information. The data files are not compressed.
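The file layout described above can be read with a few lines of Python. This is a minimal sketch, not LDC tooling: it assumes the general NIST SPHERE header convention (a "NIST_1A" line, a header-size line, then "name -type value" lines ending with "end_head") and standard G.711 mu-law decoding. The function names are hypothetical, and the exact set of header fields in this corpus beyond those named above is not confirmed here.

```python
def parse_sphere_header(raw: bytes):
    """Parse a SPHERE header from the start of `raw`.

    Returns (fields, header_size). Assumes the usual SPHERE layout.
    """
    lines = raw[:64].split(b"\n")
    if lines[0].strip() != b"NIST_1A":
        raise ValueError("not a SPHERE file")
    header_size = int(lines[1].strip())  # 1024 for this corpus, per the text
    text = raw[:header_size].decode("ascii", errors="replace")
    fields = {}
    for line in text.splitlines()[2:]:
        line = line.strip()
        if line == "end_head":
            break
        parts = line.split(None, 2)
        if len(parts) != 3 or line.startswith(";"):
            continue  # skip blank, comment, or malformed lines
        name, type_code, value = parts
        if type_code == "-i":
            fields[name] = int(value)
        elif type_code == "-r":
            fields[name] = float(value)
        else:  # "-sN": string of N characters
            fields[name] = value
    return fields, header_size


def split_channels(samples: bytes):
    """De-interleave two-channel data into (channel 0, channel 1)."""
    return samples[0::2], samples[1::2]


def mulaw_to_linear(byte: int) -> int:
    """Decode one G.711 mu-law byte to a signed 16-bit linear sample."""
    u = ~byte & 0xFF
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if u & 0x80 else magnitude


def load_sides(path: str):
    """Read one speech file; return (header fields, channel 0, channel 1)."""
    with open(path, "rb") as f:
        raw = f.read()
    fields, header_size = parse_sphere_header(raw)
    ch0, ch1 = split_channels(raw[header_size:])
    return fields, ch0, ch1
```

A call such as `fields, ch0, ch1 = load_sides(path)` would give the header fields (for example `fields["caller_pin"]`) and the raw mu-law bytes for each side; `[mulaw_to_linear(b) for b in ch0]` converts one side to linear PCM. Which channel corresponds to the caller and which to the callee is not stated here and should be checked against the corpus documentation.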
Updates
There are no updates available at this time.
Content Copyright
Portions © 2004 Trustees of the University of Pennsylvania