YOHO Speaker Verification
| Item Name: | YOHO Speaker Verification |
| Authors: | Joseph Campbell and Alan Higgins |
| LDC Catalog No.: | LDC94S16 |
| ISBN: | 1-58563-042-X |
| Data Type: | speech |
| Sample Rate: | 8000 Hz |
| Sampling Format: | 1-channel pcm compressed |
| Data Source(s): | microphone speech |
| Application(s): | speaker verification |
| Language(s): | English |
| Language ID(s): | eng |
| Distribution: | 1 CD |
| Member fee: | $0 for 1994, 1998 members |
| Non-member Fee: | US $1000.00 |
| Reduced-License Fee: | US $500.00 |
| Extra-Copy Fee: | US $150.00 |
| Non-member License: | yes |
| Online documentation: | yes |
| Licensing Instructions: | Subscription Members, Standard Members, Non-Members |
| Citation: | Joseph Campbell and Alan Higgins 1994 YOHO Speaker Verification Linguistic Data Consortium, Philadelphia |
The YOHO database contains a large-scale,
high-quality speech corpus to support text-dependent speaker
authentication research, such as that used in "secure access"
technology. The data was collected in 1989 by ITT under a US
Government contract but has not previously been available for public
use. Note that certain changes have been made to the corpus, mainly to
ensure the privacy of the speakers, and some data has been withheld by
the government for future use in testing.
YOHO contains:
- "Combination lock" phrases (e.g. 36-24-36)
- Collected over a three-month period in a real-world office environment
- Four enrollment sessions per subject with 24 phrases per session
- Ten test sessions per subject with four phrases per session
- 8 kHz sampling with 3.8 kHz analog bandwidth
- 1.5 gigabytes of data
The number of trials is thus sufficient to permit evaluation testing
at high confidence levels. In each session, a speaker was prompted
with a series of phrases to be read aloud; each phrase was a sequence
of three two-digit numbers (e.g. "35 - 72 - 41", pronounced
"thirty-five seventy-two forty-one"). The first four sessions for a
given speaker were enrollment sessions of 24 phrases each, and all
additional sessions were verification trials of four phrases each. In
all, there are 552 enrollment sessions and 1,380 trial sessions, with a
nominal time interval of three days between sessions.
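The session totals above can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of the corpus distribution: the function and constant names are hypothetical, and it assumes every speaker completed all four enrollment sessions and all ten test sessions, which the description implies but does not state outright.

```python
# Figures quoted in the corpus description.
ENROLL_SESSIONS_PER_SPEAKER = 4
ENROLL_PHRASES_PER_SESSION = 24
TEST_SESSIONS_PER_SPEAKER = 10
TEST_PHRASES_PER_SESSION = 4
TOTAL_ENROLL_SESSIONS = 552
TOTAL_TEST_SESSIONS = 1380


def corpus_counts():
    """Derive speaker and phrase counts from the published session totals.

    Assumption: every speaker completed the full session schedule.
    """
    speakers_enroll, rem_enroll = divmod(
        TOTAL_ENROLL_SESSIONS, ENROLL_SESSIONS_PER_SPEAKER
    )
    speakers_test, rem_test = divmod(
        TOTAL_TEST_SESSIONS, TEST_SESSIONS_PER_SPEAKER
    )
    # Both totals should divide evenly and imply the same number of speakers.
    if rem_enroll or rem_test or speakers_enroll != speakers_test:
        raise ValueError("session totals do not match the per-speaker schedule")

    enrollment_phrases = TOTAL_ENROLL_SESSIONS * ENROLL_PHRASES_PER_SESSION
    verification_phrases = TOTAL_TEST_SESSIONS * TEST_PHRASES_PER_SESSION
    return {
        "speakers": speakers_enroll,
        "enrollment_phrases": enrollment_phrases,
        "verification_phrases": verification_phrases,
        "total_phrases": enrollment_phrases + verification_phrases,
    }


if __name__ == "__main__":
    for name, value in corpus_counts().items():
        print(f"{name}: {value}")
```

Under that assumption, both session totals imply the same number of speakers, and the phrase counts follow directly from the per-session figures.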
Updates
An update is available that corrects a bug in the original release.