2008 |
| LDC2008T03 | ACE 2005 English SpatialML Annotations |
| LDC2008L01 | An English Dictionary of the Tamil Verb |
| LDC2008S02 | CSLU: National Cellular Telephone Speech Release 2.3 |
| LDC2008S01 | CSLU: Portland Cellular Telephone Speech Version 1.3 |
| LDC2008T02 | GALE Phase 1 Arabic Blog Parallel Text |
| LDC2008T06 | GALE Phase 1 Chinese Blog Parallel Text |
| LDC2008T01 | Hungarian-English Parallel Text, Version 1.0 |
| LDC2008T04 | OntoNotes Release 2.0 |
| LDC2008T05 | Penn Discourse Treebank Version 2.0 |
| LDC2008S03 | STC-TIMIT 1.0 |
2007 |
| LDC2007T22 | 2001 Topic Annotated Enron Email Data Set |
| LDC2007S10 | 2003 NIST Rich Transcription Evaluation Data |
| LDC2007S12 | 2004 Spring NIST Rich Transcription (RT-04S) Evaluation Data |
| LDC2007S11 | 2004 Spring NIST Rich Transcription (RT-04S) Development Data |
| LDC2007T40 | Arabic Gigaword Third Edition |
| LDC2007S03 | ARL Urdu Speech Database, Training Data |
| LDC2007T38 | Chinese Gigaword Third Edition |
| LDC2007T36 | Chinese Treebank 6.0 |
| LDC2007S08 | CSLU: Foreign Accented English Release 1.2 |
| LDC2007S18 | CSLU: Kids' Speech Version 1.1 |
| LDC2007S13 | CSLU: Apple Words and Phrases |
| LDC2007S05 | CSLU: Yes/No Version 1.2 |
| LDC2007T02 | English Chinese Translation Treebank v 1.0 |
| LDC2007T07 | English Gigaword Third Edition |
| LDC2007S02 | Fisher Levantine Arabic Conversational Telephone Speech |
| LDC2007T04 | Fisher Levantine Arabic Conversational Telephone Speech, Transcripts |
| LDC2007T24 | GALE Phase 1 Arabic Broadcast News Parallel Text - Part 1 |
| LDC2007T23 | GALE Phase 1 Chinese Broadcast News Parallel Text - Part 1 |
| LDC2007T20 | GALE Phase 1 Distillation Training |
| LDC2007T08 | ISI Arabic-English Automatically Extracted Parallel Text |
| LDC2007T09 | ISI Chinese-English Automatically Extracted Parallel Text |
| LDC2007S01 | Levantine Arabic Conversational Telephone Speech |
| LDC2007T01 | Levantine Arabic Conversational Telephone Speech, Transcripts |
| LDC2007S09 | Mandarin Affective Speech |
| LDC2007T19 | MITRE 1997 Mandarin Broadcast News Speech Translations (Hub-4NE) |
| LDC2007S15 | Nationwide Speech Project |
| LDC2007T21 | OntoNotes v 1.0 |
| LDC2007T03 | Tagged Chinese Gigaword |
| LDC2007V02 | TRECVID 2003 Keyframes & Transcripts |
| LDC2007V01 | TRECVID 2005 Keyframes & Transcripts |
2006 |
| LDC2006S44 | 2004 NIST Speaker Recognition Evaluation |
| LDC2006T06 | ACE 2005 Multilingual Training Corpus |
| LDC2006S46 | Arabic Broadcast News Speech |
| LDC2006T20 | Arabic Broadcast News Transcripts |
| LDC2006T02 | Arabic Gigaword Second Edition |
| LDC2006S15 | CSLU: Spelled and Spoken Words |
| LDC2006S14 | CSLU: Stories v 1.2 |
| LDC2006S35 | CSLU: Multilanguage Telephone Speech Version 1.2 |
| LDC2006S39 | CSLU: Names Release 1.3 |
| LDC2006S26 | CSLU: Speaker Recognition Version 1.1 |
| LDC2006S16 | CSLU: Spoltech Brazilian Portuguese Version 1.0 |
| LDC2006S01 | CSLU: Voices |
| LDC2006T10 | English-Arabic Treebank v 1.0 |
| LDC2006T17 | French Gigaword First Edition |
| LDC2006S43 | Gulf Arabic Conversational Telephone Speech |
| LDC2006T15 | Gulf Arabic Conversational Telephone Speech, Transcripts |
| LDC2006S45 | Iraqi Arabic Conversational Telephone Speech |
| LDC2006T16 | Iraqi Arabic Conversational Telephone Speech, Transcripts |
| LDC2006S42 | Korean Broadcast News Speech |
| LDC2006T14 | Korean Broadcast News Transcripts |
| LDC2006T03 | Korean Propbank |
| LDC2006T09 | Korean Treebank Annotations Version 2.0 |
| LDC2006S29 | Levantine Arabic QT Training Data Set 5, Speech |
| LDC2006T07 | Levantine Arabic QT Training Data Set 5, Transcripts |
| LDC2006S33 | Middle East Technical University Turkish Microphone Speech v 1.0 |
| LDC2006T04 | Multiple-Translation Chinese (MTC) Part 4 |
| LDC2006S13 | N4 NATO Native and Non-Native Speech |
| LDC2006S31 | NIST 2003 Language Recognition Evaluation |
| LDC2006T01 | Prague Dependency Treebank 2.0 |
| LDC2006S34 | Russian through Switched Telephone Network (RuSTeN) |
| LDC2006T12 | Spanish Gigaword First Edition |
| LDC2006S30 | Speech Controlled Computing |
| LDC2006T18 | TDT5 Multilingual Text |
| LDC2006T19 | TDT5 Topics and Annotations |
| LDC2006T08 | TimeBank 1.2 |
| LDC2006T13 | Web 1T 5-gram Version 1 |
| LDC2006S37 | West Point Heroico Spanish Speech |
| LDC2006S36 | West Point Korean Speech |
2005 |
| LDC2005T09 | ACE 2004 Multilingual Training Corpus |
| LDC2005T07 | ACE Time Normalization (TERN) 2004 English Training Data v 1.0 |
| LDC2005T35 | American National Corpus (ANC) Second Release |
| LDC2005S07 | Arabic CTS Levantine Fisher Training Data Set 3, Speech |
| LDC2005T03 | Arabic CTS Levantine Fisher Training Data Set 3, Transcripts |
| LDC2005T02 | Arabic Treebank: Part 1 v 3.0 (POS with full vocalization + syntactic analysis) |
| LDC2005T20 | Arabic Treebank: Part 3 (full corpus) v 2.0 (MPG + Syntactic Analysis) |
| LDC2005T30 | Arabic Treebank: Part 4 v 1.0 (MPG Annotation) |
| LDC2005S22 | Articulation Index |
| LDC2005T33 | BBN Pronoun Coreference and Entity Type Corpus |
| LDC2005S08 | BBN/AUB DARPA Babylon Levantine Arabic Speech and Transcripts |
| LDC2005T13 | CCGbank |
| LDC2005T34 | Chinese <-> English Named Entity Lists v 1.0 |
| LDC2005T10 | Chinese English News Magazine Parallel Text |
| LDC2005T14 | Chinese Gigaword Second Edition |
| LDC2005T06 | Chinese News Translation Text Part 1 |
| LDC2005T23 | Chinese Proposition Bank 1.0 |
| LDC2005T01 | Chinese Treebank 5.0 |
| LDC2005T01U01 | Chinese Treebank 5.1 |
| LDC2005S26 | CSLU: 22 Languages Corpus |
| LDC2005T08 | Discourse Graphbank |
| LDC2005T12 | English Gigaword Second Edition |
| LDC2005S13 | Fisher English Training Part 2, Speech |
| LDC2005T19 | Fisher English Training Part 2, Transcripts |
| LDC2005T28 | HARD 2004 Text |
| LDC2005T29 | HARD 2004 Topics and Annotations |
| LDC2005S15 | HKUST Mandarin Telephone Speech, Part 1 |
| LDC2005T32 | HKUST Mandarin Telephone Transcript Data, Part 1 |
| LDC2005S14 | Levantine Arabic QT Training Data Set 4 (Speech + Transcripts) |
| LDC2005L01 | Mawukakan Lexicon |
| LDC2005T05 | Multiple-Translation Arabic (MTA) Part 2 |
| LDC2005S16 | RT-04 MDE Training Data Speech |
| LDC2005T24 | RT-04 MDE Training Data Text/Annotations |
| LDC2005S25 | Santa Barbara Corpus of Spoken American English Part IV |
| LDC2005S11 | TDT4 Multilingual Broadcast News Speech Corpus |
| LDC2005T16 | TDT4 Multilingual Text and Annotations |
| LDC2005S30 | West Point Company G3 American English Speech |
| LDC2005S28 | West Point Croatian Speech |
2004 |
| LDC2004T15 | 2000 Communicator Dialogue Act Tagged |
| LDC2004T16 | 2001 Communicator Dialogue Act Tagged |
| LDC2004S04 | 2002 NIST Speaker Recognition Evaluation |
| LDC2004S11 | 2002 Rich Transcription Broadcast News and Conversational Telephone Speech |
| LDC2004T18 | Arabic English Parallel News Part 1 |
| LDC2004T17 | Arabic News Translation Text Part 1 |
| LDC2004T02 | Arabic Treebank: Part 2 v 2.0 |
| LDC2004T11 | Arabic Treebank: Part 3 v 1.0 |
| LDC2004L02 | Buckwalter Arabic Morphological Analyzer Version 2.0 |
| LDC2004T05 | Chinese Treebank 4.0 |
| LDC2004S01 | Czech Broadcast News Speech |
| LDC2004T01 | Czech Broadcast News Transcripts |
| LDC2004S13 | Fisher English Training Speech Part 1 Speech |
| LDC2004T19 | Fisher English Training Speech Part 1 Transcripts |
| LDC2004V01 | FORM1 Kinematic Gesture |
| LDC2004T08 | Hong Kong Parallel Text |
| LDC2004S02 | ICSI Meeting Speech |
| LDC2004T04 | ICSI Meeting Transcripts |
| LDC2004S05 | ISL Meeting Speech Part 1 |
| LDC2004T10 | ISL Meeting Transcripts Part 1 |
| LDC2004L01 | Klex: Finite-State Lexical Transducer for Korean |
| LDC2004T03 | Morphologically Annotated Korean Text |
| LDC2004T07 | Multiple-Translation Chinese (MTC) Part 3 |
| LDC2004S09 | NIST Meeting Pilot Corpus Speech |
| LDC2004T13 | NIST Meeting Pilot Corpus Transcripts and Metadata |
| LDC2004T23 | Prague Arabic Dependency Treebank 1.0 |
| LDC2004T25 | Prague Czech-English Dependency Treebank 1.0 |
| LDC2004T14 | Proposition Bank I |
| LDC2004S08 | RT-03 MDE Training Data Speech |
| LDC2004T12 | RT-03 MDE Training Data Text and Annotations |
| LDC2004S10 | Santa Barbara Corpus of Spoken American English Part III |
| LDC2004S07 | Switchboard Cellular Part 2 Audio |
| LDC2004S12 | TalkBank Ethology Data: Field Recordings of Vervet Monkey Calls |
| LDC2004T09 | TIDES Extraction (ACE) 2003 Multilingual Training Data |
2003 |
| LDC2003T03 | 1997 HUB5 German Transcripts |
| LDC2003T04 | 1997 HUB5 Spanish Transcripts |
| LDC2003T02 | 1998 HUB5 English Transcripts |
| LDC2003S01 | 2001 Communicator Evaluation |
| LDC2003T01 | 2001 HUB5 Mandarin Transcripts |
| LDC2003T11 | ACE-2 Version 1.0 |
| LDC2003T12 | Arabic Gigaword |
| LDC2003T07 | Arabic Treebank: Part 1 - 10K-word English Translation |
| LDC2003T06 | Arabic Treebank: Part 1 v 2.0 |
| LDC2003T09 | Chinese Gigaword |
| LDC2003T05 | English Gigaword |
| LDC2003V01 | FORM2 Kinematic Gesture |
| LDC2003L01 | Grassfields Bantu Fieldwork: Dschang Lexicon |
| LDC2003S02 | Grassfields Bantu Fieldwork: Dschang Tone Paradigms |
| LDC2003S07 | Korean Telephone Conversations Complete Set |
| LDC2003L02 | Korean Telephone Conversations Lexicon |
| LDC2003S03 | Korean Telephone Conversations Speech |
| LDC2003T08 | Korean Telephone Conversations Transcripts |
| LDC2003T13 | Message Understanding Conference (MUC) 6 |
| LDC2003T18 | Multiple-Translation Arabic (MTA) Part 1 |
| LDC2003T17 | Multiple-Translation Chinese (MTC) Part 2 |
| LDC2003T10 | SAID |
| LDC2003S06 | Santa Barbara Corpus of Spoken American English Part II |
| LDC2003T15 | SLX Corpus of Classic Sociolinguistic Interviews |
| LDC2003T16 | SummBank 1.0 |
| LDC2003S05 | West Point Russian Speech |
2002 |
| LDC2002S11 | 1997 HUB4 English Evaluation Speech and Transcripts |
| LDC2002S22 | 1997 HUB5 Arabic Evaluation |
| LDC2002T39 | 1997 HUB5 Arabic Transcripts |
| LDC2002S24 | 1997 HUB5 German Evaluation |
| LDC2002S25 | 1997 HUB5 Spanish Evaluation |
| LDC2002S10 | 1998 HUB5 English Evaluation |
| LDC2002S56 | 2000 Communicator Evaluation |
| LDC2002S13 | 2001 HUB5 English Evaluation |
| LDC2002S12 | 2001 HUB5 Mandarin Evaluation |
| LDC2002S34 | 2001 NIST Speaker Recognition Evaluation Corpus |
| LDC2002L49 | Buckwalter Arabic Morphological Analyzer Version 1.0 |
| LDC2002S37 | CALLHOME Egyptian Arabic Speech Supplement |
| LDC2002T38 | CALLHOME Egyptian Arabic Transcripts Supplement |
| LDC2002L27 | Chinese-English Translation Lexicon Version 3.0 |
| LDC2002S28 | Emotional Prosody Speech and Transcripts |
| LDC2002T26 | Korean English Treebank Annotations |
| LDC2002T01 | Multiple-Translation Chinese Corpus |
| LDC2002T07 | RST Discourse Treebank |
| LDC2002S06 | Switchboard-2 Phase III Audio |
| LDC2002T31 | The AQUAINT Corpus of English News Text |
| LDC2002S04 | Translanguage English Database (TED) Speech |
| LDC2002T03 | Translanguage English Database (TED) Transcripts |
| LDC2002S35 | Voicemail Corpus Part II |
| LDC2002S02 | West Point Arabic Speech |
2001 |
| LDC2001S91 | 1997 HUB4 Broadcast News Evaluation Non-English Test Material |
| LDC2001S97 | 2000 NIST Speaker Recognition Evaluation |
| LDC2001T55 | Arabic Newswire Part 1 |
| LDC2001T61 | CALLHOME Spanish Dialogue Act Annotation |
| LDC2001T62 | CETEMPúblico |
| LDC2001T11 | Chinese Treebank 2.0 |
| LDC2001S16 | Grassfields Bantu Fieldwork: Ngomba Tone Paradigms |
| LDC2001T02 | Message Understanding Conference (MUC) 7 |
| LDC2001T10 | Prague Dependency Treebank 1.0 |
| LDC2001S04 | Speech in Noisy Environments (SPINE2) Part 1 Audio |
| LDC2001T05 | Speech in Noisy Environments (SPINE2) Part 1 Transcripts |
| LDC2001S06 | Speech in Noisy Environments (SPINE2) Part 2 Audio |
| LDC2001T07 | Speech in Noisy Environments (SPINE2) Part 2 Transcripts |
| LDC2001S08 | Speech in Noisy Environments (SPINE2) Part 3 Audio |
| LDC2001T09 | Speech in Noisy Environments (SPINE2) Part 3 Transcripts |
| LDC2001S99 | Speech in Noisy Environments 1 (SPINE1 CODED) Coded Audio |
| LDC2001S13 | Switchboard Cellular Part 1 Audio |
| LDC2001S15 | Switchboard Cellular Part 1 Transcribed Audio |
| LDC2001T14 | Switchboard Cellular Part 1 Transcription |
| LDC2001T60 | Syllable-Final /s/ Lenition |
| LDC2001S93 | TDT2 Mandarin Audio Corpus |
| LDC2001T57 | TDT2 Multilanguage Text Version 4.0 |
| LDC2001S94 | TDT3 English Audio |
| LDC2001S95 | TDT3 Mandarin Audio |
| LDC2001T58 | TDT3 Multilanguage Text Version 2.0 |
2000 |
| LDC2000S86 | 1998 HUB4 Broadcast News Evaluation English Test Material |
| LDC2000S88 | 1999 HUB4 Broadcast News Evaluation English Test Material |
| LDC2000T43 | BLLIP 1987-89 WSJ Corpus Release 1 |
| LDC2000T50 | Hong Kong Hansards Parallel Text |
| LDC2000T47 | Hong Kong Laws Parallel Text |
| LDC2000T46 | Hong Kong News Parallel Text |
| LDC2000T45 | Korean Newswire |
| LDC2000S85 | Santa Barbara Corpus of Spoken American English Part I |
| LDC2000S96 | Speech in Noisy Environments (SPINE) Evaluation Audio |
| LDC2000T54 | Speech in Noisy Environments (SPINE) Evaluation Transcripts |
| LDC2000S87 | Speech in Noisy Environments (SPINE) Training Audio |
| LDC2000T49 | Speech in Noisy Environments (SPINE) Training Transcripts |
| LDC2000S92 | TDT2 Careful Transcription Audio |
| LDC2000T44 | TDT2 Careful Transcription Text |
| LDC2000T52 | TREC Mandarin |
| LDC2000T51 | TREC Spanish |
| LDC2000T53 | Voice of America (VOA) Broadcast News Czech Transcript Corpus |
| LDC2000S89 | Voice of America (VOA) Czech Broadcast News Audio |
1999 |
| LDC99S80 | 1997 Speaker Recognition Benchmark |
| LDC99S81 | 1999 Speaker Recognition Benchmark |
| LDC99L23 | American English Spoken Lexicon |
| LDC99L22 | Egyptian Colloquial Arabic Lexicon |
| LDC99T34 | Japanese Business News Text Supplement |
| LDC99T40 | Portuguese Newswire Text |
| LDC99T41 | Spanish Newswire Text, Volume 2 |
| LDC99S78 | SUSAS |
| LDC99T33 | SUSAS Transcripts |
| LDC99S79 | Switchboard-2 Phase II |
| LDC99S83 | Tactical Speaker Identification Speech Corpus (TSID) |
| LDC99S84 | TDT2 English Audio |
| LDC99T42 | Treebank-3 |
| LDC99S82 | USC Marketplace Broadcast News Speech |
| LDC99T36 | USC Marketplace Broadcast News Transcripts |
1998 |
| LDC98T31 | 1996 CSR HUB4 Language Model |
| LDC98S71 | 1997 English Broadcast News Speech (HUB4) |
| LDC98T28 | 1997 English Broadcast News Transcripts (HUB4) |
| LDC98S73 | 1997 Mandarin Broadcast News Speech (HUB4-NE) |
| LDC98T24 | 1997 Mandarin Broadcast News Transcripts (HUB4-NE) |
| LDC98S74 | 1997 Spanish Broadcast News Speech (HUB4-NE) |
| LDC98T29 | 1997 Spanish Broadcast News Transcripts (HUB4-NE) |
| LDC98S76 | 1998 Speaker Recognition Benchmark |
| LDC98L21 | COMLEX English Syntax Lexicon |
| LDC98S67 | HTIMIT |
| LDC98S69 | HUB5 Mandarin Telephone Speech Corpus |
| LDC98T26 | HUB5 Mandarin Transcripts |
| LDC98S70 | HUB5 Spanish Telephone Speech Corpus |
| LDC98T27 | HUB5 Spanish Transcripts |
| LDC98T32 | JURIS |
| LDC98S68 | LLHDB |
| LDC98T30 | North American News Text Supplement |
| LDC98S75 | Switchboard-2 Phase I |
| LDC98S72 | Taiwanese Putonghua Speech and Transcripts |
| LDC98T25 | TDT Pilot Study Corpus |
| LDC98S77 | Voicemail Corpus Part I |
1997 |
| LDC97S66 | 1996 English Broadcast News Dev and Eval (HUB4) |
| LDC97S44 | 1996 English Broadcast News Speech (HUB4) |
| LDC97T22 | 1996 English Broadcast News Transcripts (HUB4) |
| LDC97L20 | CALLHOME American English Lexicon (PRONLEX) |
| LDC97S42 | CALLHOME American English Speech |
| LDC97T14 | CALLHOME American English Transcripts |
| LDC97S45 | CALLHOME Egyptian Arabic Speech |
| LDC97T19 | CALLHOME Egyptian Arabic Transcripts |
| LDC97L18 | CALLHOME German Lexicon |
| LDC97S43 | CALLHOME German Speech |
| LDC97T15 | CALLHOME German Transcripts |
| LDC97T12 | DSO Corpus of Sense-Tagged English |
| LDC97S62 | Switchboard-1 Release 2 |
| LDC97S63 | The CMU Kids Corpus |
1996 |
| LDC96S61 | 1996 Speaker Recognition Benchmark |
| LDC96S36 | Boston University Radio Speech Corpus |
| LDC96S46 | CALLFRIEND American English-Non-Southern Dialect |
| LDC96S47 | CALLFRIEND American English-Southern Dialect |
| LDC96S48 | CALLFRIEND Canadian French |
| LDC96S49 | CALLFRIEND Egyptian Arabic |
| LDC96S50 | CALLFRIEND Farsi |
| LDC96S51 | CALLFRIEND German |
| LDC96S52 | CALLFRIEND Hindi |
| LDC96S53 | CALLFRIEND Japanese |
| LDC96S54 | CALLFRIEND Korean |
| LDC96S55 | CALLFRIEND Mandarin Chinese-Mainland Dialect |
| LDC96S56 | CALLFRIEND Mandarin Chinese-Taiwan Dialect |
| LDC96S57 | CALLFRIEND Spanish-Caribbean Dialect |
| LDC96S58 | CALLFRIEND Spanish-Non-Caribbean Dialect |
| LDC96S59 | CALLFRIEND Tamil |
| LDC96S60 | CALLFRIEND Vietnamese |
| LDC96L17 | CALLHOME Japanese Lexicon |
| LDC96S37 | CALLHOME Japanese Speech |
| LDC96T18 | CALLHOME Japanese Transcripts |
| LDC96L15 | CALLHOME Mandarin Chinese Lexicon |
| LDC96S34 | CALLHOME Mandarin Chinese Speech |
| LDC96T16 | CALLHOME Mandarin Chinese Transcripts |
| LDC96L16 | CALLHOME Spanish Lexicon |
| LDC96S35 | CALLHOME Spanish Speech |
| LDC96T17 | CALLHOME Spanish Transcripts |
| LDC96L14 | CELEX2 |
| LDC96T11 | COMLEX Syntax Text Corpus Version 2.0 |
| LDC96S33 | CSR-IV HUB3 |
| LDC96S31 | CSR-IV HUB4 |
| LDC96S30 | CTIMIT |
| LDC96S38 | DCIEM/HCRC |
| LDC96S32 | FFMTIMIT |
| LDC96S29 | Frontiers in Speech Processing 93 |
| LDC96S40 | Frontiers in Speech Processing 94 |
| LDC96S64-1 | JEIDA/JCSD-Channel 0 City Names |
| LDC96S64 | JEIDA/JCSD-Channel 0 Complete |
| LDC96S64-2 | JEIDA/JCSD-Channel 0 Control Words |
| LDC96S64-4 | JEIDA/JCSD-Channel 0 Four Digit Sequences |
| LDC96S64-3 | JEIDA/JCSD-Channel 0 Isolated Digits |
| LDC96S64-5 | JEIDA/JCSD-Channel 0 Mono Syllables |
| LDC96S65-1 | JEIDA/JCSD-Channel 1 City Names |
| LDC96S65 | JEIDA/JCSD-Channel 1 Complete |
| LDC96S65-2 | JEIDA/JCSD-Channel 1 Control Words |
| LDC96S65-4 | JEIDA/JCSD-Channel 1 Four Digit Sequences |
| LDC96S65-3 | JEIDA/JCSD-Channel 1 Isolated Digits |
| LDC96S65-5 | JEIDA/JCSD-Channel 1 Mono Syllables |
| LDC96T10 | Message Understanding Conference (MUC) 6 Additional News Text |
| LDC96S41 | VAHA (POLYPHONE II) |
1995 |
| LDC95S26 | ATIS3 Test Data |
| LDC95S23 | CSR-III Speech |
| LDC95T6 | CSR-III Text |
| LDC95T11 | European Language Newspaper Text |
| LDC95T20 | Hansard French/English |
| LDC95T8 | Japanese Business News Text |
| LDC95S22 | KING Speaker Verification |
| LDC95S28 | LATINO-40 Spanish Read News |
| LDC95T13 | Mandarin Chinese News Text |
| LDC95T21 | North American News Text Corpus |
| LDC95S27 | PhoneBook: NYNEX Isolated Words |
| LDC95T9 | Spanish News Text |
| LDC95S25 | TRAINS Spoken Dialog Corpus |
| LDC95T7 | Treebank-2 |
| LDC95S24 | WSJCAM0 Cambridge Read News |
1994 |
| LDC94S14B | Air Traffic Control BOS |
| LDC94S14A | Air Traffic Control Complete |
| LDC94S14C | Air Traffic Control DCA |
| LDC94S14D | Air Traffic Control DFW |
| LDC94S19 | ATIS3 Training Data |
| LDC94S20 | BRAMSHILL |
| LDC94S13A | CSR-II (WSJ1) Complete |
| LDC94S13C | CSR-II (WSJ1) Other |
| LDC94S13B | CSR-II (WSJ1) Sennheiser |
| LDC94T5 | ECI Multilingual Text |
| LDC94S21 | MACROPHONE |
| LDC94S17 | OGI Multilanguage Corpus |
| LDC94S18 | OGI Spelled and Spoken Word |
| LDC94S15 | SPIDRE |
| LDC94T4A | UN Parallel Text (Complete) |
| LDC94T4B-1 | UN Parallel Text (English) |
| LDC94T4B-2 | UN Parallel Text (French) |
| LDC94T4B-3 | UN Parallel Text (Spanish) |
| LDC94S16 | YOHO Speaker Verification |
1993 |
| LDC93T1 | ACL/DCI |
| LDC93S4A | ATIS0 Complete |
| LDC93S4B | ATIS0 Pilot |
| LDC93S4B-2 | ATIS0 Read |
| LDC93S4B-3 | ATIS0 SD Read |
| LDC93S5 | ATIS2 |
| LDC93S6A | CSR-I (WSJ0) Complete |
| LDC93S6C | CSR-I (WSJ0) Other |
| LDC93S6B | CSR-I (WSJ0) Sennheiser |
| LDC93S12 | HCRC Map Task Corpus |
| LDC93S2 | NTIMIT |
| LDC93S3A | Resource Management Complete Set 2.0 |
| LDC93S3B | Resource Management RM1 2.0 |
| LDC93S3C | Resource Management RM2 2.0 |
| LDC93S11 | Road Rally |
| LDC93S8 | Switchboard Credit Card |
| LDC93T4 | Switchboard-1 Transcripts |
| LDC93S9 | TI 46-Word |
| LDC93S10 | TIDIGITS |
| LDC93S1 | TIMIT Acoustic-Phonetic Continuous Speech Corpus |
| LDC93T3A | TIPSTER Complete |
| LDC93T3B | TIPSTER Volume 1 |
| LDC93T3C | TIPSTER Volume 2 |
| LDC93T3D | TIPSTER Volume 3 |