December 2016 Newsletter

Thursday, December 15, 2016

New Corpora

Bamanankan Lexicon

IARPA Babel Tagalog Language Pack IARPA-babel106-v0.2g

TAC KBP Spanish Cross-lingual Entity Linking - Comprehensive Training and Evaluation Data 2012-2014


Renew your LDC membership today

Membership Year 2017 (MY2017) is open for joining, and discounts are available for those who keep their membership current and join early in the year. Now through March 1, 2017, current MY2016 members who renew will receive a 10% discount on the membership fee. New or returning organizations will receive a 5% discount through March 1.

In addition to receiving new publications, current year LDC members also enjoy the benefit of licensing older data at reduced costs from our Catalog of almost 700 holdings; current year for-profit members may use most data for commercial applications.

Plans for MY2017 publications are in progress. Among the expected releases are:

  • 2010 NIST Speaker Recognition Evaluation data set
  • Multilanguage conversational telephone speech: developed to support language identification research in related languages
  • UCLA High Speed Laryngeal Database: audio recordings and high-speed video endoscopic images of the vocal folds while sustaining vowels
  • Noisy TIMIT: TIMIT with added artificial noise
  • CHiME shared task data: noisy read WSJ speech
  • First Year Law Students’ Memoranda: memos to a hypothetical court with annotations
  • IARPA Babel Language Packs: languages include Vietnamese, Haitian Creole, Zulu, Kazakh and Lithuanian
  • BOLT: source, parallel and word-aligned data in all languages
  • RATS Keyword Spotting data set
  • GALE Phases 3 and 4: all tasks and languages   

And don’t forget, MY2016 and MY2015 are still open for joining. MY2015 can be joined through December 31, 2016 and includes data such as RATS Speech Activity Detection and updates to Penn Treebank. MY2016 will remain open through December 31, 2017 and includes data such as BOLT Chinese Discussion Forums, IARPA Babel Language Packs and Multi-Language Conversational Telephone Speech – Slavic Group. For full descriptions of these data sets, visit our Catalog.

Visit Join LDC for details on membership, user accounts and payment.

Spring 2017 LDC Data Scholarship Program - deadline approaching

Students can apply for the Spring 2017 Data Scholarship Program now through January 16, 2017, 11:59PM EST. The LDC Data Scholarship program provides undergraduate and graduate students with access to LDC data at no cost.

For more information on application requirements and program rules, please visit LDC Data Scholarships. Students can email their applications to the LDC Data Scholarships program. Decisions will be sent by email from the same address.

LDC to close for Winter Break

LDC will be closed from Monday, December 26, 2016 through Monday, January 2, 2017 in accordance with the University of Pennsylvania Winter Break Policy. Our offices will reopen on Tuesday, January 3, 2017. Requests for membership renewals and corpora received during the Winter Break will not be processed until the week of January 3.