September 2013 Newsletter

New LDC Website Coming Soon

Look for LDC's new website in the coming weeks. We've revamped the design and site plan to make it easier than ever to find what you're looking for. The features you use the most -- the catalog, new corpus releases and user login -- will be a short click away. We expect the LDC website to be occasionally unavailable for a few days at the end of September as we make the switch, and we thank you in advance for your understanding.

 

New Corpora

Arabic, Chinese & English Web Text for Information Retrieval: BOLT Information Retrieval Comprehensive Training and Evaluation: all data produced by LDC in support of the DARPA BOLT IR task including annotations, source documents and scoring software

Task Specifications

BOLT developed technology that enables English speakers to retrieve and understand information from informal foreign language sources including chat, text messaging and spoken conversations. The genres of interest to BOLT were characterized by inherent variation and inconsistency, motivating the development of new collection and annotation methods. 

HAVIC

Heterogeneous Audio Visual Internet Collection (HAVIC)

LDC built a large corpus of multi-modal data to support research in a variety of areas including spoken term detection and video event detection. The HAVIC (Heterogeneous Audio Visual Internet Collection) Corpus consists of thousands of hours of “real world” video data collected from the internet. The corpus specifically targeted user-generated video content rather than professionally produced or commercial video content.
