Mandarin Chinese News Text
| Item Name: | Mandarin Chinese News Text |
| Authors: | Zhibiao Wu |
| LDC Catalog No.: | LDC95T13 |
| ISBN: | 1-58563-052-7 |
| Data Type: | text |
| Data Source(s): | newswire |
| Project(s): | EARS, GALE, TIDES, Tipster, TREC |
| Application(s): | information retrieval, language modeling |
| Language(s): | Mandarin Chinese |
| Distribution: | 1 CD |
| Member Fee: | $0 for 1995, 1996, 1997 members |
| Non-member Fee: | US $500.00 |
| Reduced-License Fee: | US $250.00 |
| Extra-Copy Fee: | US $150.00 |
| Non-member License: | yes |
| Member License: | yes |
| Online Documentation: | yes |
| Licensing Instructions: | Subscription Members, Standard Members, Non-Members |
| Citation: | Zhibiao Wu. 1995. Mandarin Chinese News Text. Linguistic Data Consortium, Philadelphia. |

The Linguistic Data Consortium (LDC) announces the availability of a
Mandarin Chinese text corpus. This corpus includes about 250 million
GB-encoded text characters.
The Mandarin News Corpus includes text from three journalistic sources:
- newspaper text from Renmin Ribao (People's Daily)
- radio scripts from China Radio International
- newswire text from Xinhua newswire service
The corpus is formatted with labeled bracketing, expressed in the
style of SGML (Standard Generalized Markup Language). The header
fields provided by the sources, which give information such as
topic, date, and article ID, have been retained. The articles cover a
variety of topics, including international and domestic news, sports,
and culture.
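
As a rough illustration of how GB-encoded, SGML-style text of this kind might be read, the following Python sketch decodes a byte string and collects the tagged fields of each article. It rests on assumptions that this page does not confirm: the tag names (`DOC`, `DOCID`, `DATE`, `TEXT`), the sample article, and the choice of the `gb2312` codec are all hypothetical, and the documentation shipped with the corpus should be consulted for the actual markup.

```python
import re

# Hypothetical tag names: the real markup is defined by the corpus
# documentation, not by this page.
DOC_RE = re.compile(r"<DOC>(.*?)</DOC>", re.DOTALL)
FIELD_RE = re.compile(r"<(\w+)>(.*?)</\1>", re.DOTALL)


def parse_docs(raw: bytes, encoding: str = "gb2312") -> list[dict[str, str]]:
    """Decode GB-encoded bytes and return one dict of fields per article."""
    # Assumption: "GB-encoded" means GB2312; undecodable bytes are replaced
    # rather than raising, so one bad byte does not abort a whole file.
    text = raw.decode(encoding, errors="replace")
    docs = []
    for match in DOC_RE.finditer(text):
        # Map each top-level tag inside the article to its stripped content.
        fields = {
            name.lower(): value.strip()
            for name, value in FIELD_RE.findall(match.group(1))
        }
        docs.append(fields)
    return docs


# A made-up sample article, encoded the way the sketch expects its input.
sample = (
    "<DOC>\n"
    "<DOCID> sample-0001 </DOCID>\n"
    "<DATE> 1991-01-01 </DATE>\n"
    "<TEXT>\n新华社北京电\n</TEXT>\n"
    "</DOC>\n"
).encode("gb2312")

docs = parse_docs(sample)
print(docs[0]["docid"], docs[0]["date"], docs[0]["text"])
```

Regular expressions are used here only because the sketch assumes flat, well-nested fields; a full SGML parser would be the safer choice if the actual markup uses attributes or omitted end tags.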