LDC Catalog
Search the LDC Catalog
Publication Name:
Author:
Catalog Number:
Find keywords in corpus description:
Language(s):
Albanian
Arabic
Bengali
Berber
Bulgarian
Canadian French
Cantonese
Catalan
Chinese
Croatian
Czech
Danish
Dari
Dutch
Egyptian Arabic
English
Estonian
Farsi
French
Gaelic
Georgian
German
German Sign Language
Gulf Arabic
Gullah
Hindi
Hungarian
Indian English
Indonesian
Iraqi Arabic
Italian
Japanese
Khmer
Korean
Kumarbhag Paharia
Kurdish
Lao
Latin
Levantine Arabic
Lithuanian
Lucumi
Mahou
Mal Paharia
Mandarin Chinese
Mesopotamian Arabic
Min Nan Chinese
Modern Greek
Modern Standard Arabic
Ngomba
North Levantine Arabic
North Mesopotamian Arabic
Northern Uzbek
Norwegian
Norwegian Bokmaal
Norwegian Nynorsk
Pashto
Polish
Portuguese
Punjabi
Putonghua
Romanian
Russian
Sanskrit
Sauria Paharia
Serbian
Slovenian
South Levantine Arabic
Spanish
Standard Malay
Swahili
Swedish
Tagalog
Taiwan Mandarin
Tamil
Thai
Tigrinya
Trinidadian
Turkish
Urdu
Uzbek
Vervet Monkey Calls
Vietnamese
Western Farsi
Wu Chinese
Yemba
Yoruba
Yue Chinese
Member year(s):
1993
1994
1995
1996
1997
1998
1999
2000
2001
2002
2003
2004
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018
2019
2020
2021
Corpus type(s):
lexicon
lexicon, speech, text
speech
speech, text
text
video
Data source(s):
broadcast conversation
broadcast news
dictionaries
email
fiction
field recordings
government documents
journal articles
journal entries
meeting speech
microphone conversation
microphone speech
news magazine
newsgroups
newswire
question-answers
reviews
telephone conversations
telephone speech
text chat conversations
transcribed speech
varied
video
web collection
weblogs
Research project(s):
ACE
American National Corpus (ANC)
AQUAINT
ATIS
Communicator
CoNLL
DARPA-CSR
DASL
EARS
GALE
GENOA
Hub4
Hub5-LVCSR
JANUS
LID
Linguistic Atlas Project
MADCAT
MALACH
MT08
MUC
NIST Automatic Meeting Recognition
NIST LRE
NIST MT
NIST SRE
OntoNotes
REFLEX-MTE
RM
ROAR
RT
SemEval
SID
SPINE
Talkbank
TDT
TERN
TIDES
Tipster
TREC
VACE
Recommended application(s):
automatic content extraction
bibliometrics
content-based retrieval
coreference resolution
cross-lingual information retrieval
diarization
discourse analysis
discourse parsing
distillation
entity extraction
event detection
finite state technology
gesture recognition
gesture synthesis
handwriting recognition
information detection
information extraction
information retrieval
instruction
language generation
language identification
language modeling
language teaching
linguistic analysis
machine learning
machine translation
meeting summarization
message understanding
metadata extraction
morphology
morphology learning
named entity recognition
natural language processing
nominal expression generation
parsing
part of speech tagging
phonetics
phonology
pragmatics
pronunciation modeling
prosody
psycholinguistics
question-answering
relation extraction
semantic role labelling
sociolinguistics
spatial analysis
speaker identification
speaker segmentation and tracking
speaker verification
speech recognition
speech synthesis
spoken dialogue modeling
spoken dialogue systems
spoken term detection
standards
subjectivity analysis
summarization
syntactic parsing
tagging
temporal analysis
topic detection and tracking
Search Options:
Within Fields: or (default), and
Between Fields: and (default), or
The ten (10) criteria above are available for searching the Catalog Database. Any criteria left blank are ignored. For the text fields, you may enter full or partial names, and use the underscore ('_') character in place of any character you are unsure of. More information about LDC Catalog Numbers is available on the Search By Year page.
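As a rough illustration of the text-field matching described above, the following Python sketch treats the entered text as a partial name and the underscore as a stand-in for one unknown character. The function name `text_field_matches`, the case-insensitive comparison, and the sample publication names are assumptions made for this example; the catalog's actual matching rules may differ.

```python
import re


def text_field_matches(pattern: str, value: str) -> bool:
    """Return True if `pattern` occurs anywhere in `value`.

    Hypothetical helper: '_' in the pattern stands for any one character,
    and every other character is matched literally. Case is ignored,
    which is an assumption rather than documented catalog behavior.
    """
    # Build a regular expression: '_' becomes '.', everything else is escaped.
    regex = "".join("." if ch == "_" else re.escape(ch) for ch in pattern)
    # re.search gives partial-name (substring) matching; an empty pattern
    # matches everything, mirroring "criteria left blank are ignored".
    return re.search(regex, value, flags=re.IGNORECASE) is not None


# Illustrative values only.
print(text_field_matches("Sw_tchboard", "Switchboard-1 Release 2"))
print(text_field_matches("board", "Switchboard-1 Release 2"))
```

Here the underscore fills in for a letter the user is unsure of, while a plain fragment such as "board" matches as a partial name.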
For criteria in the selection boxes above, you can select more than one value within the field and choose whether the search returns publications with at least one of the values (e.g. Membership Year = 1996 OR 1998), which is the default behavior, or only those publications with all of the chosen values (e.g. Membership Year = 1996 AND 1998). NOTE: A publication can have only one data source, so "OR" will be assumed if you select more than one value.
If you wish to search by more than one criterion (between fields), you can have the search return only those publications that match the criteria set for all of the fields (e.g. Membership Year = 1996 AND Language = English), which is the default behavior, or publications that match the criteria for at least one of the fields (e.g. Membership Year = 1996 OR Language = English).
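The within-field and between-field options can be pictured with a small Python sketch. This is only a model of the behavior described above, not the catalog's implementation: the function names, the field keys such as `member_year` and `data_source`, and the sample publication record are all hypothetical.

```python
from typing import Dict, Set


def field_matches(selected: Set[str], actual: Set[str], within: str = "or") -> bool:
    """Compare the values selected for one field against a publication's values."""
    if within == "and":
        return selected <= actual      # every selected value must be present
    return bool(selected & actual)     # at least one selected value is present


def publication_matches(
    criteria: Dict[str, Set[str]],
    publication: Dict[str, Set[str]],
    within: str = "or",     # default within a field, per the help text
    between: str = "and",   # default between fields, per the help text
) -> bool:
    """Apply the within-field and between-field options to one publication."""
    # Criteria left blank are ignored.
    active = {field: values for field, values in criteria.items() if values}
    if not active:
        return True  # assumption: an empty search matches every publication
    results = []
    for field, values in active.items():
        # A publication has only one data source, so "or" is always assumed there.
        mode = "or" if field == "data_source" else within
        results.append(field_matches(values, publication.get(field, set()), mode))
    return all(results) if between == "and" else any(results)


# Hypothetical publication record, for illustration only.
pub = {"member_year": {"1996", "1997"}, "language": {"English"}}
print(publication_matches({"member_year": {"1996", "1998"}}, pub))
print(publication_matches({"member_year": {"1996", "1998"}}, pub, within="and"))
```

With the defaults, the first call matches because the publication belongs to at least one of the selected years; switching the within-field option to "and" rejects it because it does not belong to both.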