Introduction
West Point Brazilian Portuguese Speech is a database of digital recordings
of spoken Brazilian Portuguese designed and collected by staff and faculty of
the Department of Foreign Languages (DFL) and Center for Technology Enhanced
Language Learning (CTELL) to develop acoustic models for speech recognition
systems. The U.S. government uses such systems to provide speech-recognition
enhanced language learning courseware to government linguists and students enrolled
in various government language programs.
The data in this corpus was collected in March 1999 in Brasilia, Brazil, using
informants from a Brazilian military academy. The corpus consists of read speech
from 60 female and 68 male native and non-native speakers.
The speech was elicited from a prompt script containing 296 sentences and phrases
typically used in language learning situations. The prompts are listed in the
file prompts.txt. Each line of this file has two fields separated by a tab:
the first field gives the base name of the waveform file, and the second field
gives the prompt used to record the utterance.
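The two-field layout described above can be read with a few lines of Python. This is only a minimal sketch: the base names and prompt texts in the sample are invented for illustration, and the character encoding of prompts.txt is not stated here, so it would need to be checked before opening the real file.

```python
import io


def parse_prompts(lines):
    """Map waveform base name -> prompt text from tab-separated lines."""
    prompts = {}
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        # Split on the first tab only, so a prompt may itself contain tabs.
        base_name, tab, prompt = line.partition("\t")
        if not tab:
            raise ValueError(f"no tab separator in line: {line!r}")
        prompts[base_name] = prompt.strip()
    return prompts


# Hypothetical sample in the described format (not taken from the corpus).
sample = io.StringIO("s0001\tBom dia.\ns0002\tComo vai?\n")
prompts = parse_prompts(sample)
```

For the real file, the same function could be given an open file handle, for example `open("prompts.txt", encoding=...)` with whatever encoding the file turns out to use.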
A pronouncing dictionary developed by Dr. Sheila Ackerlind with help from cadet
Sterling Packer is provided in the file SANTIAGO.txt.
The speech was collected using four laptop computers running MS Windows. Three
of the computers recorded with a 16-bit sample size at a sampling rate of 22050
Hz; the other laptop recorded with an 8-bit sample size at a sampling rate of
11025 Hz. The recording script presented a visual display of the sentence to
be recorded. The informant pressed a key and spoke the sentence. The recording
was played back for review, allowing the utterance to be re-recorded. A member
of the data collection team was present during the recording session to verify
recordings and to provide technical assistance in case of malfunctioning equipment.
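Because the recordings come in two formats (16-bit at 22050 Hz and 8-bit at 11025 Hz), it can help to inspect each file's header before training acoustic models on mixed data. The sketch below uses Python's standard `wave` module and assumes the corpus files are ordinary PCM MS Wave files that this module can read. The helper `make_test_wav` and its single-channel setting are hypothetical, used only to produce in-memory stand-ins for the demonstration.

```python
import io
import wave


def describe_wav(source):
    """Return bit depth, sampling rate, and channel count of a WAV file.

    `source` may be a file path or a binary file-like object.
    """
    with wave.open(source, "rb") as wav:
        return {
            "bits": wav.getsampwidth() * 8,
            "rate": wav.getframerate(),
            "channels": wav.getnchannels(),
        }


def make_test_wav(bits, rate, n_frames=100):
    """Build a tiny in-memory WAV stand-in (hypothetical, mono assumed)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(bits // 8)
        wav.setframerate(rate)
        wav.writeframes(b"\x00" * (bits // 8) * n_frames)
    buf.seek(0)
    return buf


# The two recording formats described above.
wide = describe_wav(make_test_wav(16, 22050))
narrow = describe_wav(make_test_wav(8, 11025))
```

Grouping files by the returned `bits` and `rate` values is one way to separate the recordings from the 8-bit laptop from the rest, or to decide which files need resampling.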
Samples
For an example of speech contained in this corpus, please listen to this audio sample (MS Wave format).
Copyright
Portions © 1999, 2004 United States Military Academy, © 2008 Trustees
of the University of Pennsylvania