September 2010 Newsletter

Friday, September 17, 2010

New Corpora

ACE Time Normalization (TERN) 2004 English Evaluation Data V1.0

Korean Newswire Second Edition

NIST 2006 Open Machine Translation (OpenMT) Evaluation

Announcements

Fall 2010 LDC Data Scholarship Winners
LDC is pleased to announce the winners in our first-ever LDC Data Scholarship program!  The LDC Data Scholarship program provides university students with access to LDC data at no cost.  Data scholarships are offered twice a year to correspond to the Fall and Spring semesters.  Students are asked to complete an application which consists of a data use proposal and a letter of support from their academic adviser.

LDC received many strong applications from both undergraduate and graduate students attending universities across the globe.  The decision process was difficult, and after much deliberation, we have selected eight winners!   These students will receive no-cost copies of LDC data valued at over US$10,000:

Aby Abraham - Ohio University (USA), graduate student, Electrical Engineering.  Aby has been awarded a copy of 2003 NIST Speaker Recognition Evaluation (LDC2010S03) for his work in using long-term memory cells for continuous speech recognition.

Ripandy Adha - Bandung Institute of Technology (Indonesia), undergraduate student, Computer Science.  Ripandy has been awarded a copy of American English Spoken Lexicon (LDC99L23) to assist in the development of a voice-command internet browser.

Basawaraj - Ohio University (USA), PhD candidate, Electrical Engineering and Computer Science.  Basawaraj has been awarded a copy of NIST 2002 Open Machine Translation (OpenMT) Evaluation (LDC2010T10) to assist in fine-tuning his machine translation system and to provide a benchmark dataset.

Zachary Brooks - University of Arizona (USA), PhD candidate, Second Language Acquisition and Teaching.  Zachary and his research group have been awarded a copy of ECI Multilingual Text (LDC94T5) for research in eye movement tracking by native and non-native readers.

Marco Carmosino - Hampshire College (USA), undergraduate student, Computer Science.  Marco has been awarded a copy of English Gigaword Fourth Edition (LDC2009T13) for his work in narrative chain extraction.

Xiaohui Huang - Harbin Institute of Technology (China), Shenzhen Graduate School.  Xiaohui has been awarded a copy of TDT5 Topics and Annotations (LDC2006T19) for his work in topic detection and tracking for large-scale web data.

Yuhuan Zhou - PLA University of Science and Technology (China), postgraduate student, Institute of Communications Engineering.  Yuhuan has been awarded a copy of 2002 NIST Speaker Recognition Evaluation (LDC2004S04) to assist in the development of a speaker recognition system which fuses support vector data description (SVDD) with a Gaussian mixture model (GMM).

Speaker Recognition Group (GEDA) with members Matias Fineschi, Gonzalo Lavigna, Jorge Prendes, and Pablo Vacatello - Buenos Aires Institute of Technology (Argentina), Department of Electrical Engineering.  GEDA has been awarded a copy of 2004 NIST Speaker Recognition Evaluation (LDC2006S44) to assist in the development of a flexible speaker verification platform capable of implementing different feature extraction methods, normalizations, stochastic models, and outputs.

Please join us in congratulating our student winners!  The next LDC Data Scholarship program is scheduled for the Spring 2011 semester.  Stay tuned for further announcements.