February 2024 Newsletter

Thursday, February 15, 2024

New Corpora 

Second Language University Speech Intelligibility Corpus

AIDA Scenario 1 Practice Topic Annotation


LDC Membership Discounts Expire March 1 
Time is running out to save on 2024 membership fees. Renew your LDC membership, rejoin the Consortium, or become a new member by March 1 to receive a discount of up to 10%. For more information on membership benefits and options, visit Join LDC.

Spring 2024 Data Scholarship Recipients 
Congratulations to the recipients of LDC’s Spring 2024 data scholarships:

Jordan Chandler: Université Rennes 2 (France): Master’s student, English Studies. Jordan is awarded a copy of Penn Parsed Corpora of Historical English LDC2020T16 to continue his research on the historical development of adjective, quantifier and article indefiniteness in the English language.

Nikhil Raghav: TCG Crest (India): PhD candidate, Institute for Advancing Intelligence. Nikhil is awarded copies of Third DIHARD Challenge Development LDC2022S12 and Third DIHARD Challenge Evaluation LDC2022S14 for his work in speaker diarization. 

Abraham Sanders: Rensselaer Polytechnic Institute (USA): PhD candidate, Cognitive Science. Abraham is awarded copies of Fisher English Training Speech Part 1 Speech LDC2004S13, Fisher English Training Speech Part 1 Transcripts LDC2004T19, Fisher English Training Part 2 Speech LDC2005S13 and Fisher English Training Part 2 Transcripts LDC2005T19 for his work in spoken dialogue systems.

The next round of applications will be accepted in September 2024. For information about the program, visit the Data Scholarships page.

Four Corpora Withdrawn from the LDC Catalog 
We regret to announce that The New York Times Annotated Corpus LDC2008T19 has been withdrawn from the LDC Catalog by the data provider. Because they contain data from LDC2008T19, the following three corpora are also withdrawn from the Catalog: Benchmarks for Open Relation Extraction LDC2014T27, Concretely Annotated New York Times LDC2018T12, and News Sub-domain Named Entity Recognition LDC2023T12. Organizations and individuals who have previously licensed any of these data sets can continue to use them under the terms of their respective special license agreements.