Linguistic Data Consortium

Accessing LDC Data

LDC data is available to everyone under a variety of license options. Our end-to-end e-commerce application makes it easy to register for an LDC user account, search the Catalog, select data and download it.

Catalog Highlights

The Catalog is a permanent archive of language resources. Its gold-standard benchmark corpora have shaped the human language technology field and continue to influence research and technology development in the AI era. There’s something for everyone in the Catalog’s broad array of data sets across languages, genres and domains.

A Closer Look at LDC

LDC is more than the Catalog. It’s a hub that supports the community through collaboration, research, data management, and additional services.