The CALLFRIEND project
supports the development of language identification technology.
The corpus consists of 60 unscripted telephone conversations, each
lasting between 5 and 30 minutes. The corpus also includes
documentation describing speaker information (sex, age, education,
callee telephone number) and call information (channel quality, number
For each conversation, both the caller and callee are native speakers
of non-Southern dialects of American English. All calls are domestic
and were placed inside the continental United States, Canada, Puerto
Rico, or the Dominican Republic.
Callers in the "non-Southern" (or "general") collection of
CALLFRIEND American English appear to come from a wide geographic
range, based on their own reports of where they were raised (some
identified their origins as being in the southeastern U.S.).
Regardless of their geographic or ethnic backgrounds, the feature they
share is the clear absence of a vowel quality pattern that would
distinguish them as speakers of a "Southern" dialect.
Some information was inadvertently left out of the speaker
information table and the call information table. Copies of these
files are available as CALLINFO.TBL and SPKRINFO.TBL.
There are no updates at this time.