|

|
|
CALLHOME Mandarin Chinese Lexicon
| |
| Item Name: | CALLHOME Mandarin Chinese Lexicon |
| Authors: | Shudong Huang, Xuejun Bian, Grace Wu and Cynthia McLemore |
| LDC Catalog No.: | LDC96L15 |
| ISBN: | 1-58563-079-9 |
| Data Type: | lexicon |
| Data Source(s): | telephone conversations |
| Project(s): | EARS, GALE, Hub5-LVCSR |
| Application(s): | speech recognition |
| Language(s): | Mandarin Chinese |
| Distribution: | Web Download |
| Member fee: | $0 for 1996, 1997 members |
| Non-member Fee: | US $2250.00 |
| Reduced-License Fee: | US $1125.00 |
| Extra-Copy Fee: | N/A |
| Non-member License: | yes |
| Member License: | yes |
| Online documentation: | yes |
| Licensing Instructions: | Subscription Members, Standard Members, Non-Members |
| Citation: | Shudong Huang, Xuejun Bian, Grace Wu and Cynthia McLemore 1996 CALLHOME Mandarin Chinese Lexicon Linguistic Data Consortium, Philadelphia |
|
| The CALLHOME Mandarin
Chinese collection includes a lexical component, the CALLHOME
Mandarin Lexicon. It consists of 44,405 words, and each entry has
separate fields for phonological, morphological and frequency
information.
The lexicon covers 98% of the word tokens in the 20 LDC Mandarin
CALLHOME devtest transcripts (ten minutes of conversation each).
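Token coverage is the share of running words in a text, counting repeats, that have an entry in the lexicon. The following is a minimal sketch of that calculation, not the procedure LDC used: the function name `token_coverage` and the toy word lists are hypothetical, and it assumes the transcripts have already been segmented into words.

```python
def token_coverage(tokens, lexicon):
    """Return the fraction of tokens (repeats included) found in the lexicon.

    tokens  -- iterable of already-segmented words from a transcript (assumed input)
    lexicon -- set of headwords
    """
    tokens = list(tokens)
    if not tokens:
        return 0.0  # no tokens, so nothing is covered
    covered = sum(1 for t in tokens if t in lexicon)
    return covered / len(tokens)


# Toy data for illustration only; not taken from the CALLHOME transcripts.
toy_lexicon = {"我", "你", "好"}
toy_tokens = ["我", "好", "你", "好", "吗"]

# 4 of the 5 tokens are in the toy lexicon.
print(token_coverage(toy_tokens, toy_lexicon))
```

Because repeats are counted, a frequent word contributes more to token coverage than a rare one, which is why this figure can be high even when some rare word types are missing.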
Chinese characters are GB-encoded and use the simplified forms of
the Mainland standard. Each headword is also given in tone pinyin with
strictly lexical tone, i.e. not reflecting phonetic/phonological
processes.
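As a rough illustration of working with such data, the sketch below decodes GB-encoded bytes and splits a tone pinyin string into syllables and tone numbers. It rests on two assumptions not confirmed by this description: that "GB" means GB 2312 (Python's `gb2312` codec), and that tone pinyin is written as space-separated syllables with a trailing tone digit such as `ni3 hao3`. The function names are hypothetical; the lexicon's own documentation defines the real field layout.

```python
import re


def decode_gb(raw: bytes) -> str:
    """Decode GB-encoded bytes to a Python string (assumes GB 2312)."""
    return raw.decode("gb2312")


def split_tone_pinyin(pinyin: str):
    """Split 'ni3 hao3' into [('ni', 3), ('hao', 3)].

    Assumes space-separated syllables, each ending in a tone digit 1-5.
    """
    result = []
    for syllable in pinyin.split():
        match = re.fullmatch(r"([a-z:]+)([1-5])", syllable)
        if match is None:
            raise ValueError(f"not a tone-pinyin syllable: {syllable!r}")
        result.append((match.group(1), int(match.group(2))))
    return result


# Round-trip a sample string through GB 2312 to stand in for bytes read from a file.
raw = "你好".encode("gb2312")
print(decode_gb(raw))
print(split_tone_pinyin("ni3 hao3"))
```

When reading a real file, passing `encoding="gb2312"` to `open()` has the same effect as calling `decode_gb` on the raw bytes.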
A sample page from the lexicon is available.
The transcripts and documentation (LDC96T16) are
available separately, as is a corpus of telephone speech (LDC96S34).
|
|
|