Language Resources

Language resources are the collective materials used by those engaged in language-related education, research and technology development. Spanning data collections, corpora, software, research papers and specifications, these vital tools aid and inspire scientific progress.

The Data pages represent the heart of LDC's mission to make language resources available to the broad community. Obtaining Data describes how to license data from LDC's growing online Catalog. LDC Online contains portions of the Consortium's data collections in a searchable format. LDC members have full access to this database; guest accounts are also available for nonmembers. Students pursuing studies in the field can submit an application to the semi-annual Data Scholarships program.

LDC's Tools pages highlight software developed and distributed by LDC. All tools are available at no cost under open source licenses, though specific license terms may vary by tool.

The Papers page contains publications by LDC staff about their work and issues of interest in the field, including papers and slides presented at conferences, journal publications and books.

LDC's LR Wiki catalogs data, software, descriptive grammars and other resources for a variety of languages, especially those for which few research resources are generally available. The wiki contains resource listings for Bengali, Berber, Breton, Ewe, Greek (Ancient), Indonesian, Hindi, Latin, Panjabi, Pashto, Sorani (Central Kurdish), Russian, Tagalog, Tamil and Urdu, and for the following sign languages: American, British, Catalan, Dutch, Flemish, German, Japanese, New Zealand, Polish, Spanish and Swiss German.