| Catalog Number | Corpus Name |
| --- | --- |
| LDC98T31 | 1996 CSR HUB4 Language Model |
| LDC97S66 | 1996 English Broadcast News Dev and Eval (HUB4) |
| LDC97S44 | 1996 English Broadcast News Speech (HUB4) |
| LDC97T22 | 1996 English Broadcast News Transcripts (HUB4) |
| LDC96S61 | 1996 Speaker Recognition Benchmark |
| LDC98S71 | 1997 English Broadcast News Speech (HUB4) |
| LDC98T28 | 1997 English Broadcast News Transcripts (HUB4) |
| LDC2001S91 | 1997 HUB4 Broadcast News Evaluation Non-English Test Material |
| LDC2002S11 | 1997 HUB4 English Evaluation Speech and Transcripts |
| LDC2002S22 | 1997 HUB5 Arabic Evaluation |
| LDC2002T39 | 1997 HUB5 Arabic Transcripts |
| LDC2002S24 | 1997 HUB5 German Evaluation |
| LDC2003T03 | 1997 HUB5 German Transcripts |
| LDC2002S25 | 1997 HUB5 Spanish Evaluation |
| LDC2003T04 | 1997 HUB5 Spanish Transcripts |
| LDC98S73 | 1997 Mandarin Broadcast News Speech (HUB4-NE) |
| LDC98T24 | 1997 Mandarin Broadcast News Transcripts (HUB4-NE) |
| LDC98S74 | 1997 Spanish Broadcast News Speech (HUB4-NE) |
| LDC98T29 | 1997 Spanish Broadcast News Transcripts (HUB4-NE) |
| LDC99S80 | 1997 Speaker Recognition Benchmark |
| LDC2000S86 | 1998 HUB4 Broadcast News Evaluation English Test Material |
| LDC2002S10 | 1998 HUB5 English Evaluation |
| LDC2003T02 | 1998 HUB5 English Transcripts |
| LDC98S76 | 1998 Speaker Recognition Benchmark |
| LDC2000S88 | 1999 HUB4 Broadcast News Evaluation English Test Material |
| LDC99S81 | 1999 Speaker Recognition Benchmark |
| LDC2004T15 | 2000 Communicator Dialogue Act Tagged |
| LDC2001S97 | 2000 NIST Speaker Recognition Evaluation |
| LDC2004T16 | 2001 Communicator Dialogue Act Tagged |
| LDC2003S01 | 2001 Communicator Evaluation |
| LDC2002S13 | 2001 HUB5 English Evaluation |
| LDC2002S12 | 2001 HUB5 Mandarin Evaluation |
| LDC2003T01 | 2001 HUB5 Mandarin Transcripts |
| LDC2002S34 | 2001 NIST Speaker Recognition Evaluation Corpus |
| LDC2007T22 | 2001 Topic Annotated Enron Email Data Set |
| LDC2004S04 | 2002 NIST Speaker Recognition Evaluation |
| LDC2004S11 | 2002 Rich Transcription Broadcast News and Conversational Telephone Speech |
| LDC2006S31 | 2003 NIST Language Recognition Evaluation |
| LDC2007S10 | 2003 NIST Rich Transcription Evaluation Data |
| LDC2006S44 | 2004 NIST Speaker Recognition Evaluation |
| LDC2007S12 | 2004 Spring NIST Rich Transcription (RT-04S) Evaluation Data |
| LDC2007S11 | 2004 Spring NIST Rich Transcription (RT-04S) Development Data |
| LDC2008S05 | 2005 NIST Language Recognition Evaluation |
| LDC2009S04 | 2007 NIST Language Recognition Evaluation Test Set |
| LDC2009S05 | 2007 NIST Language Recognition Evaluation Supplemental Training Set |
| LDC2009T12 | 2008 CoNLL Shared Task Data |
| LDC2009T05 | 2008 NIST Metrics for Machine Translation (MetricsMATR08) Development Data |
| LDC2005T09 | ACE 2004 Multilingual Training Corpus |
| LDC2008T03 | ACE 2005 English SpatialML Annotations |
| LDC2006T06 | ACE 2005 Multilingual Training Corpus |
| LDC2005T07 | ACE Time Normalization (TERN) 2004 English Training Data v 1.0 |
| LDC2003T11 | ACE-2 Version 1.0 |
| LDC93T1 | ACL/DCI |
| LDC94S14B | Air Traffic Control BOS |
| LDC94S14A | Air Traffic Control Complete |
| LDC94S14C | Air Traffic Control DCA |
| LDC94S14D | Air Traffic Control DFW |
| LDC99L23 | American English Spoken Lexicon |
| LDC2005T35 | American National Corpus (ANC) Second Release |
| LDC2008L01 | An English Dictionary of the Tamil Verb |
| LDC2009L01 | An English Dictionary of the Tamil Verb Second Edition |
| LDC2008T25 | AQUAINT-2 Information-Retrieval Text Research Collection |
| LDC2006S46 | Arabic Broadcast News Speech |
| LDC2006T20 | Arabic Broadcast News Transcripts |
| LDC2005S07 | Arabic CTS Levantine Fisher Training Data Set 3, Speech |
| LDC2005T03 | Arabic CTS Levantine Fisher Training Data Set 3, Transcripts |
| LDC2004T18 | Arabic English Parallel News Part 1 |
| LDC2003T12 | Arabic Gigaword |
| LDC2006T02 | Arabic Gigaword Second Edition |
| LDC2007T40 | Arabic Gigaword Third Edition |
| LDC2004T17 | Arabic News Translation Text Part 1 |
| LDC2009T22 | Arabic Newswire English Translation Collection |
| LDC2001T55 | Arabic Newswire Part 1 |
| LDC2003T07 | Arabic Treebank: Part 1 - 10K-word English Translation |
| LDC2003T06 | Arabic Treebank: Part 1 v 2.0 |
| LDC2005T02 | Arabic Treebank: Part 1 v 3.0 (POS with full vocalization + syntactic analysis) |
| LDC2004T02 | Arabic Treebank: Part 2 v 2.0 |
| LDC2005T20 | Arabic Treebank: Part 3 (full corpus) v 2.0 (MPG + Syntactic Analysis) |
| LDC2004T11 | Arabic Treebank: Part 3 v 1.0 |
| LDC2005T30 | Arabic Treebank: Part 4 v 1.0 (MPG Annotation) |
| LDC2007S03 | ARL Urdu Speech Database, Training Data |
| LDC2005S22 | Articulation Index |
| LDC93S4A | ATIS0 Complete |
| LDC93S4B | ATIS0 Pilot |
| LDC93S4B-2 | ATIS0 Read |
| LDC93S4B-3 | ATIS0 SD Read |
| LDC93S5 | ATIS2 |
| LDC95S26 | ATIS3 Test Data |
| LDC94S19 | ATIS3 Training Data |
| LDC2009V01 | Audiovisual Database of Spoken American English |
| LDC2005T33 | BBN Pronoun Coreference and Entity Type Corpus |
| LDC2005S08 | BBN/AUB DARPA Babylon Levantine Arabic Speech and Transcripts |
| LDC2009T04 | BioProp Version 1.0 |
| LDC2000T43 | BLLIP 1987-89 WSJ Corpus Release 1 |
| LDC2008T13 | BLLIP North American News Text, Complete |
| LDC2008T14 | BLLIP North American News Text, General Release |
| LDC96S36 | Boston University Radio Speech Corpus |
| LDC94S20 | BRAMSHILL |
| LDC2002L49 | Buckwalter Arabic Morphological Analyzer Version 1.0 |
| LDC2004L02 | Buckwalter Arabic Morphological Analyzer Version 2.0 |
| LDC96S46 | CALLFRIEND American English-Non-Southern Dialect |
| LDC96S47 | CALLFRIEND American English-Southern Dialect |
| LDC96S48 | CALLFRIEND Canadian French |
| LDC96S49 | CALLFRIEND Egyptian Arabic |
| LDC96S50 | CALLFRIEND Farsi |
| LDC96S51 | CALLFRIEND German |
| LDC96S52 | CALLFRIEND Hindi |
| LDC96S53 | CALLFRIEND Japanese |
| LDC96S54 | CALLFRIEND Korean |
| LDC96S55 | CALLFRIEND Mandarin Chinese-Mainland Dialect |
| LDC96S56 | CALLFRIEND Mandarin Chinese-Taiwan Dialect |
| LDC96S57 | CALLFRIEND Spanish-Caribbean Dialect |
| LDC96S58 | CALLFRIEND Spanish-Non-Caribbean Dialect |
| LDC96S59 | CALLFRIEND Tamil |
| LDC96S60 | CALLFRIEND Vietnamese |
| LDC97L20 | CALLHOME American English Lexicon (PRONLEX) |
| LDC97S42 | CALLHOME American English Speech |
| LDC97T14 | CALLHOME American English Transcripts |
| LDC97S45 | CALLHOME Egyptian Arabic Speech |
| LDC2002S37 | CALLHOME Egyptian Arabic Speech Supplement |
| LDC97T19 | CALLHOME Egyptian Arabic Transcripts |
| LDC2002T38 | CALLHOME Egyptian Arabic Transcripts Supplement |
| LDC97L18 | CALLHOME German Lexicon |
| LDC97S43 | CALLHOME German Speech |
| LDC97T15 | CALLHOME German Transcripts |
| LDC96L17 | CALLHOME Japanese Lexicon |
| LDC96S37 | CALLHOME Japanese Speech |
| LDC96T18 | CALLHOME Japanese Transcripts |
| LDC96L15 | CALLHOME Mandarin Chinese Lexicon |
| LDC96S34 | CALLHOME Mandarin Chinese Speech |
| LDC96T16 | CALLHOME Mandarin Chinese Transcripts |
| LDC2008T17 | CALLHOME Mandarin Chinese Transcripts - XML version |
| LDC2001T61 | CALLHOME Spanish Dialogue Act Annotation |
| LDC96L16 | CALLHOME Spanish Lexicon |
| LDC96S35 | CALLHOME Spanish Speech |
| LDC96T17 | CALLHOME Spanish Transcripts |
| LDC2005T13 | CCGbank |
| LDC96L14 | CELEX2 |
| LDC2001T62 | CETEMpublico |
| LDC2008S09 | CHAracterizing INdividual Speakers (CHAINS) |
| LDC2005T34 | Chinese <-> English Name Entity Lists v 1.0 |
| LDC2005T10 | Chinese English News Magazine Parallel Text |
| LDC2003T09 | Chinese Gigaword |
| LDC2009T27 | Chinese Gigaword Fourth Edition |
| LDC2005T14 | Chinese Gigaword Second Edition |
| LDC2007T38 | Chinese Gigaword Third Edition |
| LDC2005T06 | Chinese News Translation Text Part 1 |
| LDC2005T23 | Chinese Proposition Bank 1.0 |
| LDC2008T07 | Chinese Proposition Bank 2.0 |
| LDC2001T11 | Chinese Treebank 2.0 |
| LDC2004T05 | Chinese Treebank 4.0 |
| LDC2005T01 | Chinese Treebank 5.0 |
| LDC2005T01U01 | Chinese Treebank 5.1 |
| LDC2007T36 | Chinese Treebank 6.0 |
| LDC2002L27 | Chinese-English Translation Lexicon Version 3.0 |
| LDC98L21 | COMLEX English Syntax Lexicon |
| LDC96T11 | COMLEX Syntax Text Corpus Version 2.0 |
| LDC2008T24 | COMNOM v 1.0 |
| LDC2007S08 | CSLU: Foreign Accented English Release 1.2 |
| LDC2007S18 | CSLU: Kids' Speech Version 1.1 |
| LDC2006S15 | CSLU: Spelled and Spoken Words |
| LDC2006S14 | CSLU: Stories v 1.2 |
| LDC2008S06 | CSLU: Alphadigit Version 1.3 |
| LDC2007S13 | CSLU: Apple Words and Phrases |
| LDC2008S07 | CSLU: ISOLET Spoken Letter Database Version 1.3 |
| LDC2006S35 | CSLU: Multilanguage Telephone Speech Version 1.2 |
| LDC2006S39 | CSLU: Names Release 1.3 |
| LDC2008S02 | CSLU: National Cellular Telephone Speech Release 2.3 |
| LDC2009S01 | CSLU: Numbers Version 1.3 |
| LDC2008S01 | CSLU: Portland Cellular Telephone Speech Version 1.3 |
| LDC2009S03 | CSLU: S4X Release 1.2 |
| LDC2006S26 | CSLU: Speaker Recognition Version 1.1 |
| LDC2006S16 | CSLU: Spoltech Brazilian Portuguese Version 1.0 |
| LDC2006S01 | CSLU: Voices |
| LDC2007S05 | CSLU: Yes/No Version 1.2 |
| LDC93S6A | CSR-I (WSJ0) Complete |
| LDC93S6C | CSR-I (WSJ0) Other |
| LDC93S6B | CSR-I (WSJ0) Sennheiser |
| LDC94S13A | CSR-II (WSJ1) Complete |
| LDC94S13C | CSR-II (WSJ1) Other |
| LDC94S13B | CSR-II (WSJ1) Sennheiser |
| LDC95S23 | CSR-III Speech |
| LDC95T6 | CSR-III Text |
| LDC96S33 | CSR-IV HUB3 |
| LDC96S31 | CSR-IV HUB4 |
| LDC96S30 | CTIMIT |
| LDC2008T22 | Czech Academic Corpus 2.0 |
| LDC2009T20 | Czech Broadcast Conversation MDE Transcripts |
| LDC2009S02 | Czech Broadcast Conversation Speech |
| LDC2004S01 | Czech Broadcast News Speech |
| LDC2004T01 | Czech Broadcast News Transcripts |
| LDC96S38 | DCIEM/HCRC |
| LDC2005T08 | Discourse Graphbank |
| LDC97T12 | DSO Corpus of Sense-Tagged English |
| LDC94T5 | ECI Multilingual Text |
| LDC99L22 | Egyptian Colloquial Arabic Lexicon |
| LDC2002S28 | Emotional Prosody Speech and Transcripts |
| LDC2007T02 | English Chinese Translation Treebank v 1.0 |
| LDC2009T01 | English CTS Treebank with Structural Metadata |
| LDC2003T05 | English Gigaword |
| LDC2009T13 | English Gigaword Fourth Edition |
| LDC2005T12 | English Gigaword Second Edition |
| LDC2007T07 | English Gigaword Third Edition |
| LDC2006T10 | English-Arabic Treebank v 1.0 |
| LDC95T11 | European Language Newspaper Text |
| LDC2009T23 | FactBank 1.0 |
| LDC96S32 | FFMTIMIT |
| LDC2005S13 | Fisher English Training Part 2, Speech |
| LDC2005T19 | Fisher English Training Part 2, Transcripts |
| LDC2004S13 | Fisher English Training Speech Part 1 Speech |
| LDC2004T19 | Fisher English Training Speech Part 1 Transcripts |
| LDC2007S02 | Fisher Levantine Arabic Conversational Telephone Speech |
| LDC2007T04 | Fisher Levantine Arabic Conversational Telephone Speech, Transcripts |
| LDC2004V01 | FORM1 Kinematic Gesture |
| LDC2003V01 | FORM2 Kinematic Gesture |
| LDC2006T17 | French Gigaword First Edition |
| LDC2009T28 | French Gigaword Second Edition |
| LDC96S29 | Frontiers in Speech Processing 93 |
| LDC96S40 | Frontiers in Speech Processing 94 |
| LDC2008T02 | GALE Phase 1 Arabic Blog Parallel Text |
| LDC2007T24 | GALE Phase 1 Arabic Broadcast News Parallel Text - Part 1 |
| LDC2008T09 | GALE Phase 1 Arabic Broadcast News Parallel Text - Part 2 |
| LDC2009T03 | GALE Phase 1 Arabic Newsgroup Parallel Text - Part 1 |
| LDC2009T09 | GALE Phase 1 Arabic Newsgroup Parallel Text - Part 2 |
| LDC2008T06 | GALE Phase 1 Chinese Blog Parallel Text |
| LDC2009T02 | GALE Phase 1 Chinese Broadcast Conversation Parallel Text - Part 1 |
| LDC2009T06 | GALE Phase 1 Chinese Broadcast Conversation Parallel Text - Part 2 |
| LDC2007T23 | GALE Phase 1 Chinese Broadcast News Parallel Text - Part 1 |
| LDC2008T08 | GALE Phase 1 Chinese Broadcast News Parallel Text - Part 2 |
| LDC2008T18 | GALE Phase 1 Chinese Broadcast News Parallel Text - Part 3 |
| LDC2009T15 | GALE Phase 1 Chinese Newsgroup Parallel Text - Part 1 |
| LDC2007T20 | GALE Phase 1 Distillation Training |
| LDC2008L03 | Global Yoruba Lexical Database v. 1.0 |
| LDC2003L01 | Grassfields Bantu Fieldwork: Dschang Lexicon |
| LDC2003S02 | Grassfields Bantu Fieldwork: Dschang Tone Paradigms |
| LDC2001S16 | Grassfields Bantu Fieldwork: Ngomba Tone Paradigms |
| LDC2006S43 | Gulf Arabic Conversational Telephone Speech |
| LDC2006T15 | Gulf Arabic Conversational Telephone Speech, Transcripts |
| LDC95T20 | Hansard French/English |
| LDC2005T28 | HARD 2004 Text |
| LDC2005T29 | HARD 2004 Topics and Annotations |
| LDC93S12 | HCRC Map Task Corpus |
| LDC2008L02 | Hindi WordNet |
| LDC2005S15 | HKUST Mandarin Telephone Speech, Part 1 |
| LDC2005T32 | HKUST Mandarin Telephone Transcript Data, Part 1 |
| LDC2000T50 | Hong Kong Hansards Parallel Text |
| LDC2000T47 | Hong Kong Laws Parallel Text |
| LDC2000T46 | Hong Kong News Parallel Text |
| LDC2004T08 | Hong Kong Parallel Text |
| LDC98S67 | HTIMIT |
| LDC98S69 | HUB5 Mandarin Telephone Speech Corpus |
| LDC98T26 | HUB5 Mandarin Transcripts |
| LDC98S70 | HUB5 Spanish Telephone Speech Corpus |
| LDC98T27 | HUB5 Spanish Transcripts |
| LDC2008T01 | Hungarian-English Parallel Text, Version 1.0 |
| LDC2004S02 | ICSI Meeting Speech |
| LDC2004T04 | ICSI Meeting Transcripts |
| LDC2006S45 | Iraqi Arabic Conversational Telephone Speech |
| LDC2006T16 | Iraqi Arabic Conversational Telephone Speech, Transcripts |
| LDC2007T08 | ISI Arabic-English Automatically Extracted Parallel Text |
| LDC2007T09 | ISI Chinese-English Automatically Extracted Parallel Text |
| LDC2004S05 | ISL Meeting Speech Part 1 |
| LDC2004T10 | ISL Meeting Transcripts Part 1 |
| LDC95T8 | Japanese Business News Text |
| LDC99T34 | Japanese Business News Text Supplement |
| LDC2009T08 | Japanese Web N-gram Version 1 |
| LDC96S64-1 | JEIDA/JCSD-Channel 0 City Names |
| LDC96S64 | JEIDA/JCSD-Channel 0 Complete |
| LDC96S64-2 | JEIDA/JCSD-Channel 0 Control Words |
| LDC96S64-4 | JEIDA/JCSD-Channel 0 Four Digit Sequences |
| LDC96S64-3 | JEIDA/JCSD-Channel 0 Isolated Digits |
| LDC96S64-5 | JEIDA/JCSD-Channel 0 Mono Syllables |
| LDC96S65-1 | JEIDA/JCSD-Channel 1 City Names |
| LDC96S65 | JEIDA/JCSD-Channel 1 Complete |
| LDC96S65-2 | JEIDA/JCSD-Channel 1 Control Words |
| LDC96S65-4 | JEIDA/JCSD-Channel 1 Four Digit Sequences |
| LDC96S65-3 | JEIDA/JCSD-Channel 1 Isolated Digits |
| LDC96S65-5 | JEIDA/JCSD-Channel 1 Mono Syllables |
| LDC98T32 | JURIS |
| LDC95S22 | KING Speaker Verification |
| LDC2004L01 | Klex: Finite-State Lexical Transducer for Korean |
| LDC2006S42 | Korean Broadcast News Speech |
| LDC2006T14 | Korean Broadcast News Transcripts |
| LDC2002T26 | Korean English Treebank Annotations |
| LDC2000T45 | Korean Newswire |
| LDC2006T03 | Korean Propbank |
| LDC2003S07 | Korean Telephone Conversations Complete Set |
| LDC2003L02 | Korean Telephone Conversations Lexicon |
| LDC2003S03 | Korean Telephone Conversations Speech |
| LDC2003T08 | Korean Telephone Conversations Transcripts |
| LDC2006T09 | Korean Treebank Annotations Version 2.0 |
| LDC2009T10 | Language Understanding Annotation Corpus |
| LDC95S28 | LATINO-40 Spanish Read News |
| LDC2008S08 | LDC Spoken Language Sampler |
| LDC2007S01 | Levantine Arabic Conversational Telephone Speech |
| LDC2007T01 | Levantine Arabic Conversational Telephone Speech, Transcripts |
| LDC2005S14 | Levantine Arabic QT Training Data Set 4 (Speech + Transcripts) |
| LDC2006S29 | Levantine Arabic QT Training Data Set 5, Speech |
| LDC2006T07 | Levantine Arabic QT Training Data Set 5, Transcripts |
| LDC98S68 | LLHDB |
| LDC94S21 | MACROPHONE |
| LDC2007S09 | Mandarin Affective Speech |
| LDC95T13 | Mandarin Chinese News Text |
| LDC2005L01 | Mawukakan Lexicon |
| LDC2003T13 | Message Understanding Conference (MUC) 6 |
| LDC96T10 | Message Understanding Conference (MUC) 6 Additional News Text |
| LDC2001T02 | Message Understanding Conference (MUC) 7 |
| LDC2006S33 | Middle East Technical University Turkish Microphone Speech v 1.0 |
| LDC2007T19 | MITRE 1997 Mandarin Broadcast News Speech Translations (Hub-4NE) |
| LDC2004T03 | Morphologically Annotated Korean Text |
| LDC2003T18 | Multiple-Translation Arabic (MTA) Part 1 |
| LDC2005T05 | Multiple-Translation Arabic (MTA) Part 2 |
| LDC2003T17 | Multiple-Translation Chinese (MTC) Part 2 |
| LDC2004T07 | Multiple-Translation Chinese (MTC) Part 3 |
| LDC2006T04 | Multiple-Translation Chinese (MTC) Part 4 |
| LDC2002T01 | Multiple-Translation Chinese Corpus |
| LDC2006S13 | N4 NATO Native and Non-Native Speech |
| LDC2007S15 | Nationwide Speech Project |
| LDC2004S09 | NIST Meeting Pilot Corpus Speech |
| LDC2004T13 | NIST Meeting Pilot Corpus Transcripts and Metadata |
| LDC2008T23 | NomBank v 1.0 |
| LDC95T21 | North American News Text Corpus |
| LDC98T30 | North American News Text Supplement |
| LDC2008T15 | North American News Text, Complete |
| LDC2008T16 | North American News Text, General Release |
| LDC93S2 | NTIMIT |
| LDC2009T26 | NXT Switchboard Annotations |
| LDC94S17 | OGI Multilanguage Corpus |
| LDC94S18 | OGI Spelled and Spoken Word |
| LDC2007T21 | OntoNotes Release 1.0 |
| LDC2008T04 | OntoNotes Release 2.0 |
| LDC2009T24 | OntoNotes Release 3.0 |
| LDC2008T05 | Penn Discourse Treebank Version 2.0 |
| LDC2008T20 | PennBioIE CYP 1.0 |
| LDC2008T21 | PennBioIE Oncology 1.0 |
| LDC95S27 | PhoneBook: NYNEX Isolated Words |
| LDC99T40 | Portuguese Newswire Text |
| LDC2004T23 | Prague Arabic Dependency Treebank 1.0 |
| LDC2004T25 | Prague Czech-English Dependency Treebank 1.0 |
| LDC2001T10 | Prague Dependency Treebank 1.0 |
| LDC2006T01 | Prague Dependency Treebank 2.0 |
| LDC2004T14 | Proposition Bank I |
| LDC2009T11 | REFLEX Entity Translation Training/DevTest |
| LDC93S3A | Resource Management Complete Set 2.0 |
| LDC93S3B | Resource Management RM1 2.0 |
| LDC93S3C | Resource Management RM2 2.0 |
| LDC93S11 | Road Rally |
| LDC2002T07 | RST Discourse Treebank |
| LDC2004S08 | RT-03 MDE Training Data Speech |
| LDC2004T12 | RT-03 MDE Training Data Text and Annotations |
| LDC2005S16 | RT-04 MDE Training Data Speech |
| LDC2005T24 | RT-04 MDE Training Data Text/Annotations |
| LDC2006S34 | Russian through Switched Telephone Network (RuSTeN) |
| LDC2003T10 | SAID |
| LDC2000S85 | Santa Barbara Corpus of Spoken American English Part I |
| LDC2003S06 | Santa Barbara Corpus of Spoken American English Part II |
| LDC2004S10 | Santa Barbara Corpus of Spoken American English Part III |
| LDC2005S25 | Santa Barbara Corpus of Spoken American English Part IV |
| LDC2003T15 | SLX Corpus of Classic Sociolinguistic Interviews |
| LDC2006T12 | Spanish Gigaword First Edition |
| LDC2009T21 | Spanish Gigaword Second Edition |
| LDC95T9 | Spanish News Text |
| LDC99T41 | Spanish Newswire Text, Volume 2 |
| LDC2006S30 | Speech Controlled Computing |
| LDC2000S96 | Speech in Noisy Environments (SPINE) Evaluation Audio |
| LDC2000T54 | Speech in Noisy Environments (SPINE) Evaluation Transcripts |
| LDC2000S87 | Speech in Noisy Environments (SPINE) Training Audio |
| LDC2000T49 | Speech in Noisy Environments (SPINE) Training Transcripts |
| LDC2001S04 | Speech in Noisy Environments (SPINE2) Part 1 Audio |
| LDC2001T05 | Speech in Noisy Environments (SPINE2) Part 1 Transcripts |
| LDC2001S06 | Speech in Noisy Environments (SPINE2) Part 2 Audio |
| LDC2001T07 | Speech in Noisy Environments (SPINE2) Part 2 Transcripts |
| LDC2001S08 | Speech in Noisy Environments (SPINE2) Part 3 Audio |
| LDC2001T09 | Speech in Noisy Environments (SPINE2) Part 3 Transcripts |
| LDC2001S99 | Speech in Noisy Environments 1 (SPINE1 CODED) Coded Audio |
| LDC94S15 | SPIDRE |
| LDC2008S03 | STC-TIMIT 1.0 |
| LDC2003T16 | SummBank 1.0 |
| LDC99S78 | SUSAS |
| LDC99T33 | SUSAS Transcripts |
| LDC2001S13 | Switchboard Cellular Part 1 Audio |
| LDC2001S15 | Switchboard Cellular Part 1 Transcribed Audio |
| LDC2001T14 | Switchboard Cellular Part 1 Transcription |
| LDC2004S07 | Switchboard Cellular Part 2 Audio |
| LDC93S8 | Switchboard Credit Card |
| LDC97S62 | Switchboard-1 Release 2 |
| LDC93T4 | Switchboard-1 Transcripts |
| LDC98S75 | Switchboard-2 Phase I |
| LDC99S79 | Switchboard-2 Phase II |
| LDC2002S06 | Switchboard-2 Phase III Audio |
| LDC2001T60 | Syllable-Final /s/ Lenition |
| LDC99S83 | Tactical Speaker Identification Speech Corpus (TSID) |
| LDC2007T03 | Tagged Chinese Gigaword |
| LDC2009T14 | Tagged Chinese Gigaword Version 2.0 |
| LDC98S72 | Taiwanese Putonghua Speech and Transcripts |
| LDC2004S12 | TalkBank Ethology Data: Field Recordings of Vervet Monkey Calls |
| LDC98T25 | TDT Pilot Study Corpus |
| LDC2000S92 | TDT2 Careful Transcription Audio |
| LDC2000T44 | TDT2 Careful Transcription Text |
| LDC99S84 | TDT2 English Audio |
| LDC2001S93 | TDT2 Mandarin Audio Corpus |
| LDC2001T57 | TDT2 Multilanguage Text Version 4.0 |
| LDC2001S94 | TDT3 English Audio |
| LDC2001S95 | TDT3 Mandarin Audio |
| LDC2001T58 | TDT3 Multilanguage Text Version 2.0 |
| LDC2005S11 | TDT4 Multilingual Broadcast News Speech Corpus |
| LDC2005T16 | TDT4 Multilingual Text and Annotations |
| LDC2006T18 | TDT5 Multilingual Text |
| LDC2006T19 | TDT5 Topics and Annotations |
| LDC2002T31 | The AQUAINT Corpus of English News Text |
| LDC97S63 | The CMU Kids Corpus |
| LDC2008T19 | The New York Times Annotated Corpus |
| LDC93S9 | TI 46-Word |
| LDC2004T09 | TIDES Extraction (ACE) 2003 Multilingual Training Data |
| LDC93S10 | TIDIGITS |
| LDC2006T08 | TimeBank 1.2 |
| LDC93S1 | TIMIT Acoustic-Phonetic Continuous Speech Corpus |
| LDC93T3A | TIPSTER Complete |
| LDC93T3B | TIPSTER Volume 1 |
| LDC93T3C | TIPSTER Volume 2 |
| LDC93T3D | TIPSTER Volume 3 |
| LDC95S25 | TRAINS Spoken Dialog Corpus |
| LDC2002S04 | Translanguage English Database (TED) Speech |
| LDC2002T03 | Translanguage English Database (TED) Transcripts |
| LDC2000T52 | TREC Mandarin |
| LDC2000T51 | TREC Spanish |
| LDC2007V02 | TRECVID 2003 Keyframes & Transcripts |
| LDC2007V01 | TRECVID 2005 Keyframes & Transcripts |
| LDC95T7 | Treebank-2 |
| LDC99T42 | Treebank-3 |
| LDC94T4A | UN Parallel Text (Complete) |
| LDC94T4B-1 | UN Parallel Text (English) |
| LDC94T4B-2 | UN Parallel Text (French) |
| LDC94T4B-3 | UN Parallel Text (Spanish) |
| LDC2009T07 | Unified Linguistic Annotation Text Collection |
| LDC99S82 | USC Marketplace Broadcast News Speech |
| LDC99T36 | USC Marketplace Broadcast News Transcripts |
| LDC96S41 | VAHA (POLYPHONE II) |
| LDC2000S89 | Voice of America (VOA) Czech Broadcast News Audio |
| LDC2000T53 | Voice of America (VOA) Czech Broadcast News Transcripts |
| LDC98S77 | Voicemail Corpus Part I |
| LDC2002S35 | Voicemail Corpus Part II |
| LDC2006T13 | Web 1T 5-gram Version 1 |
| LDC2009T25 | Web 1T 5-gram, 10 European Languages Version 1 |
| LDC2002S02 | West Point Arabic Speech |
| LDC2008S04 | West Point Brazilian Portuguese Speech |
| LDC2005S30 | West Point Company G3 American English Speech |
| LDC2005S28 | West Point Croatian Speech |
| LDC2006S37 | West Point Heroico Spanish Speech |
| LDC2006S36 | West Point Korean Speech |
| LDC2003S05 | West Point Russian Speech |
| LDC95S24 | WSJCAM0 Cambridge Read News |
| LDC94S16 | YOHO Speaker Verification |
Total: 453 Matching Publication(s) Found.