Description
This release of English Chinese Translation Treebank v. 1.0 consists of 146,300 words in 325 files of individual news stories from Xinhua News Agency (corresponding to the Xinhua data in Chinese Treebank 5.0, LDC Catalog No. LDC2005T01) that have been translated into English, part-of-speech tagged, and treebanked. The files are compressed with gzip.
The source files for the treebank annotation contain the final updated
translation of these files. Translation errors that prevented
complete treebank annotation have been corrected. This translation
and annotation were completed in October 2004 and supersede any
earlier translation.
This publication was compiled under National Science Foundation Grant #IIS-0325646.
Samples
For an example of the data in this publication, please view this sample.
Copyright
Portions © 1994-1998 Xinhua News Agency, © 2004, 2007 Trustees of the University of Pennsylvania