Join LDC for Membership Year 2019

Membership Year 2019 (MY2019) is open, and discounts are available for those who keep their membership current and join early in the year. Current MY2018 members who renew their LDC membership before March 1, 2019 will receive a 10% discount on the membership fee. New or returning organizations will receive a 5% discount through March 1.

In addition to receiving new publications, current LDC members also enjoy the benefit of licensing older data at reduced costs from our Catalog of over 750 holdings. Current-year for-profit members may use most data for commercial applications. 

Plans for MY2019 publications are in progress. Among the expected releases are:

  • SRI Speech-Based Collaborative Learning Corpus: speech from over 100 US middle school students performing collaborative learning tasks; includes audio recordings, orthographic transcriptions, manual annotation of collaboration, and related documentation
  • Chinese Abstract Meaning Representation (AMR): developed by Nanjing Normal University and Brandeis University, semantic representation of approximately 10,000 Chinese sentences following the basic principles of AMR using web source data from Chinese Treebank 8.0 (LDC2013T21)
  • Multilanguage conversational telephone speech: developed to support language identification research in related languages (Arabic, East Asian, English, Mandarin)
  • TAC KBP: English entity discovery and linking, nugget detection and event argument data, Chinese slot-filling data
  • CALLFRIEND Second Edition: updated releases with .wav format audio, simplified directory structure and enhanced documentation and metadata (English, Egyptian Arabic, Mandarin Chinese-Taiwan)
  • HAVIC Med Progress Test data: English web video, metadata, and annotations for developing multimedia systems
  • IARPA Babel Language Packs (telephone speech and transcripts): languages include Amharic, Guarani, Igbo, and Lithuanian
  • BOLT: discussion forums, SMS, word-aligned and tagged data in all languages (Chinese, Egyptian Arabic, English)

It’s also not too late to join for MY2017 (through December 31, 2018) and MY2018 (through December 31, 2019). Data sets from those years include 2010 NIST Speaker Recognition Evaluation Test Set, RATS Keyword Spotting and Language Identification releases, CHiME, Noisy TIMIT Speech, Concretely Annotated New York Times and English Gigaword, DIRHA English WSJ Audio, LORELEI Amharic and Somali Language Packs, and DEFT Spanish Treebank. For full descriptions of all LDC data sets, browse our Catalog.

Visit Join LDC for details on membership, user accounts and payment.