This file contains documentation on the CSLU Speaker Recognition Corpus, Version 1.1, Linguistic
Data Consortium (LDC) catalog number LDC2006S26 and ISBN
The Speaker Recognition corpus (formerly known
as Speaker Verification) consists of telephone speech from 91
participants. Each participant recorded speech in twelve sessions over
a two-year period, answering questions like "what is
your eye color" or responding to prompts like "describe a typical day in
your life." Most of the utterances in this release of the corpus have
corresponding non-time-aligned word-level transcriptions.
In most CSLU data collections, each participant calls a
toll-free telephone number and answers a few questions. CSLU records
the speech, transcribes it, then packages it as a released corpus.
The Speaker Recognition data collection was considerably more
complicated. The goal was to collect speech
from each participant over a two-year period. Each participant
called the data collection system 12 times over the two-year period
and said the same utterances each time.
Some of the recording sessions were only a few days apart and others
several weeks apart. Participants followed this
calling schedule. During the first month, they called twice
in a week. No calls were made in the second and third months. In the
fourth month they made one call. No calls were made in the fifth and
sixth months. This pattern repeated three more times for a total of
12 calls per participant.
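
The schedule above can be written out as a short Python sketch. It is only
an illustration of the arithmetic, not part of the corpus tooling: the names
CYCLE_LENGTH, CYCLES, CALLS_IN_CYCLE and calls_per_month are hypothetical,
and months are numbered 1 to 24 relative to each participant's own start.

```python
# Illustrative sketch of the calling schedule described above.
# All names here are hypothetical; months are counted from 1 to 24,
# relative to the month in which a participant started.
CYCLE_LENGTH = 6                # the pattern spans six months
CYCLES = 4                      # the first pass plus three repeats
CALLS_IN_CYCLE = {1: 2, 4: 1}   # month within the cycle -> number of calls


def calls_per_month():
    """Return a dict mapping relative month (1..24) to the number of calls."""
    schedule = {}
    for cycle in range(CYCLES):
        for month_in_cycle in range(1, CYCLE_LENGTH + 1):
            month = cycle * CYCLE_LENGTH + month_in_cycle
            schedule[month] = CALLS_IN_CYCLE.get(month_in_cycle, 0)
    return schedule


schedule = calls_per_month()
active_months = [month for month, calls in schedule.items() if calls > 0]
total_calls = sum(schedule.values())

print(active_months)  # [1, 4, 7, 10, 13, 16, 19, 22]
print(total_calls)    # 12
```

Under this reading, calls fall in relative months 1, 4, 7, 10, 13, 16, 19
and 22, with two calls in each of months 1, 7, 13 and 19.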
In order to balance the workload required to remind participants to
call and to avoid large data collection bursts on the system, the
participants were divided into 12 groups. Each group began the
two-year schedule in a successive month. The first group started in
September 1996, the second in October 1996, and so on.
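
As a rough illustration of the group staggering, the following Python sketch
computes the start month of each group. It assumes that "and so on" means
every group started exactly one month after the previous one, so the dates
for groups 3 through 12 are inferred rather than stated in this
documentation. The function name group_start is hypothetical.

```python
# Illustrative sketch: start month of each of the 12 participant groups.
# Assumption: groups start one month apart, beginning in September 1996.
FIRST_YEAR = 1996
FIRST_MONTH = 9  # September


def group_start(group):
    """Return (year, month) in which the given group (1..12) began calling."""
    if not 1 <= group <= 12:
        raise ValueError("group must be between 1 and 12")
    # Count months from January of FIRST_YEAR, with January as offset 0.
    offset = (FIRST_MONTH - 1) + (group - 1)
    return FIRST_YEAR + offset // 12, offset % 12 + 1


for group in (1, 2, 5, 12):
    print(group, group_start(group))
```

With this assumption, group 1 maps to (1996, 9), group 2 to (1996, 10),
group 5 to (1997, 1), and group 12 to (1997, 8).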
For an example of the data in this corpus, please listen to the following audio sample.
Portions © 1996-2002 Center for Spoken Language Understanding,
Oregon Health & Science University, © 2006 Trustees of the University of