Introduction
The CALLFRIEND project
supports the development of language identification technology.
Data
The corpus consists of 60 unscripted telephone conversations,
each lasting between 5 and 30 minutes. The corpus also includes
documentation describing speaker information (sex, age, education,
callee telephone number) and call information (channel quality, number
of speakers).
For each conversation, both the caller and callee are native speakers
of Spanish from non-Caribbean countries. All calls are domestic and
were placed inside the continental United States, Canada, Puerto Rico,
or the Dominican Republic.
Conversations were labeled as either "Caribbean" or "non-Caribbean"
based on particular attributes in the speech of the participants.
Callers in the "Caribbean" and "non-Caribbean" collections of
CALLFRIEND Spanish were identified primarily on the basis of consonant
quality patterns, specifically, word-final "s."
Updates
There are no updates at this time.
Content Copyright
Portions © 1996 Trustees of the University of Pennsylvania