Introduction
SWB-2 Phase II consists of 4,472 five-minute telephone
conversations involving 679 participants. This corpus
was collected by the Linguistic Data Consortium (LDC)
in support of a project on Speaker Recognition
sponsored by the U.S. Department of Defense.
Data
Speakers were solicited by the LDC to participate in
this telephone speech collection effort via the
Internet, newspaper advertisements, and personal
contacts. The majority of participants resided in
the following states:
State    Number of Speakers
---------------------------
MN       156
WI       105
OH        70
IA        64
MI        41
IL        37
Participants in SWB-2 Phase II were recruited from the following
midwestern college campuses: Iowa State University, Michigan State
University, University of Michigan, University of Minnesota,
University of Wisconsin at Madison, Northwestern University, and Ohio
State University.
Each recruit was asked to participate in at least ten five-minute phone
calls. Ideally, each participant would receive five calls at a
designated number and make five calls from phones with different
Automatic Number Identification (ANI) codes. Participants were asked
to discuss a specific topic (read by
the automated operator) and not to provide personal information during
their call.
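
As an illustration only, the ideal call pattern described above can be
written as a small check. This sketch is not part of the LDC collection
software. The record layout (a direction plus an ANI code) and the
function name meets_ideal_design are hypothetical, and reading the
targets as minimums ("at least") is an assumption.

```python
from typing import Iterable, Tuple

# Hypothetical call record: (direction, ani), where direction is
# "received" or "placed" and ani is the ANI code tied to the call.
Call = Tuple[str, str]


def meets_ideal_design(calls: Iterable[Call]) -> bool:
    """Check a participant's calls against the ideal pattern.

    Assumed reading of the design: at least ten calls in total, at least
    five received, and at least five placed from distinct ANI codes.
    """
    calls = list(calls)
    received = sum(1 for direction, _ in calls if direction == "received")
    # Distinct ANI codes among placed calls only.
    placed_anis = {ani for direction, ani in calls if direction == "placed"}
    return len(calls) >= 10 and received >= 5 and len(placed_anis) >= 5
```

A participant who placed all five outgoing calls from the same phone
would fail this check, because only one distinct ANI code would be seen.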
Each of the 679 participants placed their calls via a toll-free robot
operator maintained by the LDC. Access to the robot operator was
possible via a unique Personal Identification Number (PIN) issued by
the recruiting staff at the LDC when the caller enrolled in the
project.
Upon conclusion of the study, all calls were audited by LDC staff
members. Particular attention was paid to PIN verification (matching
speaker with PIN), call duration, and call quality. Upon
completion of this process, checks were issued and mailed to
participants. The conversations have not been transcribed.
Updates
09/29/2011: Updated the file table to accurately reflect the files now that they are on DVDs. Also, updated the readme to indicate these changes.
Copyright
Portions © 1999 Trustees of the University of Pennsylvania