What's New! What's Free!
OLAC Search - search for
language resources from dozens of language data centers and language archives.
Release of AGLIB 2.0! - new version of software infrastructure for linguistic
annotation.
MIXER
participants make 12 calls for pay at
$6/call, plus the possibility of multiple $50 bonuses!
Member Resources Page!
The LDC has new and improved resources for
members and membership info. Please check it out!
What's New Archive
New Corpora
Chinese Treebank Version 4.0 ~ 400K words with syntactic bracketing
Klex: Finite-State Lexical Transducer for Korean ~ for morphological analysis and generation applications
Morphologically Annotated Korean Text ~ annotated morphological analysis and part-of-speech tags
TIDES Extraction (ACE) 2003 Multilingual Training Data ~ English, Chinese, and Arabic news text annotated for entities and relations
Arabic Treebank: Part 2 v 2.0 ~ 140K words of annotated Arabic newswire text
New Corpora Archive
ACL Anthology ~ A Digital Archive of Research Papers in Computational Linguistics, hosted at the LDC
The Linguistic Data Consortium supports language-related education, research
and technology development by creating and sharing linguistic resources:
data, tools and standards.
Contact ldc@ldc.upenn.edu

LDC is supported in part by grant IRI-9528587 from the Information and Intelligent
Systems division and grant 9982201 from the Human Computer Interaction Program of the
National Science Foundation.
LDC's corpus creation efforts are powered in part by Academic Equipment Grant 7826-990
237-US from Sun Microsystems.
© 1996-2000 Linguistic Data Consortium, University of Pennsylvania. All Rights Reserved.