The CALLFRIEND project
supports the development of language identification technology.
The corpus consists of 60 unscripted telephone conversations,
each lasting between 5 and 30 minutes. The corpus also includes
documentation describing speaker information (sex, age, education,
callee telephone number) and call information (channel quality, number
For each conversation, both the caller and callee are native speakers
of Southern American English. All calls are domestic and were placed
inside the continental United States, Canada, Puerto Rico or the
Callers in the "Southern" collection of CALLFRIEND American
English were identified primarily on the basis of vowel quality
patterns that are common among native speakers raised in the
southeastern United States (from Texas eastward to the Atlantic coast
and from Virginia and Kentucky southward to the Gulf of Mexico). This
category also includes a small number of African-American speakers,
whose geographic origins may be more dispersed, but who share some of
the vowel quality patterns distinctive of Southern white speakers.
(Of course, other dialect features involving phonology, syntax and
prosody serve to differentiate these two subgroups within the
There are no updates at this time.