August 2013 Newsletter

Monday, August 19, 2013

New Corpora

GALE Phase 2 Chinese Broadcast Conversation Parallel Text Part 2

MADCAT (Multilingual Automatic Document Classification Analysis and Translation) Phase 3 Training Set

Mixer 6 Speech


Mixer 6 Speech now available

The release of Mixer 6 Speech this month marks the first time in close to a decade that LDC has made available a large-scale speech training data collection. Representing more than 15,000 hours of speech from over 500 speakers, Mixer 6 follows in the footsteps of the Switchboard and Fisher studies by providing a large database of telephone conversations with the addition of subject interviews and transcript readings. Participants were native American English speakers local to the Philadelphia area. Mixer 6 Speech is a members-only release and a great reason to join the consortium. In addition to this substantial resource, members enjoy rights to other data released in 2013 and can license older publications at reduced fees. Please see the full description of Mixer 6 Speech.

Fall 2013 LDC Data Scholarship Program - deadline approaching

The deadline for the Fall 2013 LDC Data Scholarship Program is one month away! Student applications are being accepted now through September 16, 2013, 11:59 PM EST. The LDC Data Scholarship program provides university students with access to LDC data at no cost. This program is open to students pursuing undergraduate or graduate studies at an accredited college or university. LDC Data Scholarships are not restricted to any particular field of study; however, students must demonstrate a well-developed research agenda and a bona fide inability to pay.

Students will need to complete an application that consists of a data use proposal and a letter of support from their adviser. For further information on application materials and program rules, please visit the LDC Data Scholarship page.

Students can email their applications to the LDC Data Scholarship program. Decisions will be sent by email from the same address.

LDC at Interspeech 2013, Lyon, France

LDC will once again be exhibiting at Interspeech, held this year August 25-29 in Lyon. Please stop by LDC’s booth to learn about recent developments at the Consortium, including new publications.

Also, be on the lookout for the following presentations:

Speech Activity Detection on YouTube Using Deep Neural Networks: Neville Ryant, Mark Liberman, Jiahong Yuan (all LDC)
    o   Monday 26 August, Poster 6, 16.00 – 18.00
    o   Room: Forum 6

The Spectral Dynamics of Vowels in Mandarin Chinese: Jiahong Yuan
    o   Tuesday 27 August, Oral 17, 14.00 – 16.00
    o   Room: Gratte-Ciel 3

Automatic Phonetic Segmentation using Boundary Models: Jiahong Yuan (LDC), Neville Ryant (LDC), Mark Liberman (LDC), Andreas Stolcke, Vikramjit Mitra, Wen Wang
    o   Wednesday 28 August, Oral 32, 14.00 – 16.00
    o   Room: Gratte-Ciel 3

LDC will continue to post conference updates via our Facebook page. We hope to see you there!