March 2011 Newsletter

Wednesday, March 16, 2011

New Corpora

2008/2010 NIST Metrics for Machine Translation (MetricsMaTr) GALE Evaluation Set

NIST/USF Evaluation Resources for the VACE Program – Meeting Data Training Set Part 1


Spring 2011 LDC Data Scholarship Recipients
LDC is pleased to announce the student recipients of the Spring 2011 LDC Data Scholarship program! The LDC Data Scholarship program provides university students with access to LDC data at no cost. Students were asked to complete an application consisting of a proposal describing their intended use of the data and a letter of support from their thesis adviser. LDC received many solid applications from both undergraduate and graduate students attending universities across the globe. After careful deliberation, we have chosen eight proposals to support. These students will receive no-cost copies of LDC data:

Roberto Aceves - Monterrey Institute of Technology and Superior Studies, ITESM (Mexico), graduate student, Computer Science. Roberto has been awarded a copy of the Speech in Noisy Environments (SPINE) database for his research in automatic speech recognition in noisy environments.

Daniel Escobar - Monterrey Institute of Technology and Superior Studies, ITESM (Mexico), graduate student, Mechatronics and Automation. Daniel has been awarded a copy of Switchboard-2 and NIST SRE for designing a parallel joint factor analysis architecture for a speaker verification system.

Erhan Guven - The George Washington University (USA), graduate student, Computer Science. Erhan has been awarded a copy of Emotional Prosody (LDC2002S28) for his work in extracting speaker emotional state from spectrograms.

Anup Kolya - Jadavpur University (India), graduate student, Computer Science and Engineering. Anup has been awarded a copy of ACE 2005 English SpatialML Annotations (LDC2008T03), ACE Time Normalization (TERN) 2004 English Evaluation Data V1.0 (LDC2010T18), and ACE Time Normalization (TERN) 2004 English Training Data v 1.0 (LDC2005T07) for his research in temporal information extraction.

Benjamín Martínez Elizalde - Monterrey Institute of Technology and Superior Studies, ITESM (Mexico), graduate student, Computer Science. Benjamín has been awarded a copy of Switchboard-2 and NIST SRE to support his research in speaker verification modeling.

Hanan Waer - Newcastle University (UK), graduate student, Educational and Applied Linguistics. Hanan has been awarded a copy of CALLHOME Egyptian Arabic Transcripts (LDC97T19), CALLHOME Egyptian Arabic Transcripts Supplement (LDC2002T38), and Egyptian Colloquial Arabic Lexicon (LDC99L22) for her research in comparing Arabic/English code switching in everyday Arabic conversation and academic discourse.

Muhua Zhu - Northeastern University (China), graduate student, Natural Language Processing. Muhua has been awarded a copy of Chinese Treebank 7.0 (LDC2010T07) to support the development of a high-accuracy Chinese parser.

Vignesh Kalaiselvan, Ganapathy Raman Kasi, Preetham Samue, Ramsrinivas Anantharamakrishnan, and Sathyanarayan Jeevan - Amrita Vishwa Vidyapeetham University (India), undergraduate students, Electronics and Communication Engineering. The group has been awarded CALLHOME Speech, Transcripts, and Lexicon in Egyptian Arabic and German for their research in deriving robust features for multilingual acoustic modeling.

Please join us in congratulating our student winners! The next LDC Data Scholarship program is scheduled for the Fall 2011 semester.

LDC at NEALLT 2011
LDC will be exhibiting at the upcoming NEALLT (North East Association for Language Learning Technology) conference, which will be held at the University of Pennsylvania from 1-3 April 2011. NEALLT is the regional chapter of the International Association for Language Learning Technology and works to improve language instruction through the use of technology.

LDC’s Dr Mohamed Maamouri will discuss how resources developed and distributed by LDC can aid language education in the presentation “Incorporating Resources and New Technologies in Language Education” on Saturday, April 2 (Session 9: 4.00-4.20 pm, Cohen G17). The presentation will highlight the LDC Arabic Reading Enhancement Tool, designed to support the development of reading skills for learning Arabic as a first and second language.

We hope to see you there!