Introduction
The CALLFRIEND project was designed to
support the development of language identification technology.
Data
The corpus consists of 60 telephone conversations,
each lasting between 5 and 30 minutes. The corpus also includes
documentation describing speaker information (sex, age, education,
callee telephone number) and call information (channel quality, number
of speakers).
For each conversation, both the caller and callee are native speakers
of Korean. All calls are domestic and were placed inside the
continental United States and Canada.
Updates
Transcripts for 49 of the 60 calls are now available as CALLFRIEND Korean Transcripts (LDC2003T08).
An additional 51 calls have been published as CALLFRIEND Korean Speech Supplement (LDC2003S03).
Content Copyright
Portions © 1996 Trustees of the University of Pennsylvania