Introduction
Switchboard-2 Phase I consists of 3,638 five-minute telephone
conversations involving 657 participants. This corpus was collected by
the Linguistic Data Consortium (LDC), in support of a project on
Speaker Recognition sponsored by the U.S. Department of
Defense. This release consists of speech files only; these calls
were not transcribed.
Data
Speakers were solicited by the LDC to participate in this telephone
speech collection effort via the internet, publications
(advertisements) and personal contacts. Potential participants
responded from all areas of the United States, although the majority
of the subjects were from the Mid-Atlantic area: PA (303), NJ (116),
NY (53), DE (13), CT (12), MD (14), OH (13) and MA (8). Most of the
participants in SWB-2 Phase I were college students from the following
universities: Penn State University, University of Delaware,
University of Pennsylvania, Drexel University and Rutgers University.
Of the 657 participants, 358 were female and 299 were male. An LDC
recruiter asked all participants for the following demographic
information: age, sex, years of completed education, country of birth,
and city and state where raised.
Each recruit was asked to participate in at least ten five-minute phone
calls. Ideally each participant would receive five calls at a designated
number and make five calls from phones with different telephone numbers
(ANI codes). The average subject participated in 11 conversations;
however, one gentleman participated in 64 calls. A suggested topic of
discussion was given (read by the automated operator), although
participants could chat about whatever they preferred.
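As a rough consistency check on the figures quoted above, the average of 11 conversations per subject can be reproduced from the corpus totals. This is only a sketch: it assumes every conversation has exactly two sides, which the description implies but does not state, and the variable names are illustrative.

```python
# Figures quoted in the corpus description.
conversations = 3638
participants = 657

# Assumption (not stated in the description): each telephone
# conversation has exactly two sides, one per participant.
SIDES_PER_CONVERSATION = 2

call_sides = conversations * SIDES_PER_CONVERSATION
mean_calls_per_participant = call_sides / participants

print(f"{call_sides} call sides, {mean_calls_per_participant:.1f} per participant")
```

Under that assumption the mean comes out at roughly 11 calls per participant, in line with the stated average.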
Each of the 657 participants placed their calls via a toll-free robot
operator maintained by the LDC. Access to the robot operator was
possible via a unique Personal Identification Number (PIN) issued by
the recruiting staff at the LDC when the caller enrolled in the
project.
Upon conclusion of the study, all calls were audited by LDC staff
members. Particular attention was paid to PIN verification (matching
speaker with PIN), call duration and call quality. Upon
completion of this process, payment checks were issued and mailed to
participants.
Updates
09/29/2011: Added a file list, available through the online docs, to reflect its release on DVD. Also added an updated readme reflecting these changes.
Copyright
Portions © 1998 Trustees of the University of Pennsylvania