



LDC Top Ten Corpora

These are the 10 most popular LDC corpora. Each entry gives the number of copies distributed, then the LDC catalog ID, then the corpus title.

978   LDC93S1      TIMIT Acoustic-Phonetic Continuous Speech Corpus
725   LDC96L14     CELEX2
709   LDC2006T13   Web 1T 5-gram Version 1
426   LDC93S10     TIDIGITS
388   LDC94T5      ECI Multilingual Text
334   LDC99T42     Treebank-3
323   LDC93T3A     TIPSTER Complete
312   LDC93S2      NTIMIT
264   LDC94S16     YOHO Speaker Verification
262   LDC2001T02   Message Understanding Conference (MUC) 7


Contact: ldc@ldc.upenn.edu

(c) 1992-2008 Linguistic Data Consortium, University of Pennsylvania. All Rights Reserved.