DATA.DOC
release January, 1994
The speech and transcription filenames are of the form:
<langcode><callnumber><type>.wav - compressed waveform data, with a
    1024-byte uncompressed SPHERE header
<langcode><callnumber><type>.seg - time-aligned broad phonetic
    transcriptions
<langcode><callnumber>.log - information file containing the results
    of preliminary verification and evaluation of each call
where:
<langcode> the first two letters of the language name, for each of
    the ten languages, e.g. fa (Farsi), vi (Vietnamese), etc.
<callnumber> positive 3-digit integer (with leading zeros as needed)
<type> any one of:
nlg - native language
clg - common language
dow - days of the week
num - numbers 0 through 10
htl - hometown likes
htc - hometown climate
roo - room description
mea - description of most recent meal
stb - free speech before the tone
sta - free speech after the tone
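
As an illustration of the naming scheme above, the following Python
sketch splits a filename into its parts. It is not part of the
release. The names parse_filename, CorpusFile and UTTERANCE_TYPES are
hypothetical, and the sketch assumes filenames are lowercase, as in
the fa and vi examples.

```python
import re
from typing import NamedTuple, Optional

# Utterance type codes as listed in this document.
UTTERANCE_TYPES = {
    "nlg": "native language",
    "clg": "common language",
    "dow": "days of the week",
    "num": "numbers 0 through 10",
    "htl": "hometown likes",
    "htc": "hometown climate",
    "roo": "room description",
    "mea": "description of most recent meal",
    "stb": "free speech before the tone",
    "sta": "free speech after the tone",
}

# Assumption: lowercase names; two-letter language code, three-digit
# call number, optional three-letter type, then the extension.
_NAME_RE = re.compile(
    r"^(?P<lang>[a-z]{2})(?P<call>\d{3})(?P<type>[a-z]{3})?\.(?P<ext>wav|seg|log)$"
)


class CorpusFile(NamedTuple):
    langcode: str
    callnumber: int
    utt_type: Optional[str]  # None for .log files
    extension: str


def parse_filename(name: str) -> CorpusFile:
    """Split a corpus filename into language, call number, type, extension."""
    m = _NAME_RE.match(name)
    if m is None:
        raise ValueError(f"not a recognised corpus filename: {name!r}")
    call = int(m.group("call"))
    utt_type = m.group("type")
    ext = m.group("ext")
    if call == 0:
        raise ValueError(f"call number must be positive: {name!r}")
    if ext == "log":
        # .log files describe a whole call and carry no type code.
        if utt_type is not None:
            raise ValueError(f".log files have no type code: {name!r}")
    elif utt_type not in UTTERANCE_TYPES:
        raise ValueError(f"unknown or missing type code: {name!r}")
    return CorpusFile(m.group("lang"), call, utt_type, ext)
```

For example, parse_filename("fa001stb.wav") yields the language code
"fa", call number 1, type "stb" and extension "wav", while a .log
name yields a type of None.
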
The stb (max. duration 50 seconds) and sta (max. 10 seconds)
utterances together form the 1-minute free speech portion of each
call. The explanation below clarifies the reasons for this breakup
into two files.
Instead of abruptly cutting off the caller at the end of 1 minute, it
was decided, based on trial runs of the recording protocol, to play a
"time is up" tone after the first 50 seconds, to inform the caller
that (s)he had only 10 seconds left to bring his/her free speech
response to a coherent end. The purpose of this tone was explained to
callers, and a sample tone was played to them, before the actual
1-minute recording began. The Gradient Desklab software required that
the responses before and after the tone be recorded in separate
files.
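
Because the free speech response is stored as two files per call, a
user of the corpus may want to locate both halves together. The Python
sketch below is one hypothetical way to do so. The function names and
the idea of checking durations are assumptions for illustration; only
the 50-second and 10-second limits and the stb/sta naming come from
this document.

```python
from typing import Tuple

# Maximum durations stated in this document, in seconds.
STB_MAX_SECONDS = 50  # free speech before the tone
STA_MAX_SECONDS = 10  # free speech after the tone


def free_speech_files(langcode: str, callnumber: int,
                      extension: str = "wav") -> Tuple[str, str]:
    """Return the (before-tone, after-tone) filenames for one call."""
    stem = f"{langcode}{callnumber:03d}"  # call number is zero-padded to 3 digits
    return (f"{stem}stb.{extension}", f"{stem}sta.{extension}")


def within_limits(stb_seconds: float, sta_seconds: float) -> bool:
    """Check two measured durations against the documented maxima."""
    return (0 <= stb_seconds <= STB_MAX_SECONDS
            and 0 <= sta_seconds <= STA_MAX_SECONDS)
```

For call 7 in Farsi, free_speech_files("fa", 7) gives the pair
fa007stb.wav and fa007sta.wav; passing extension="seg" gives the
corresponding transcription files.
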