

Past Projects

The LDC has been involved in a number of projects to support language education, research and technology development.

CallFriend Farsi - LDC has just completed the auditing, segmentation and transcription of 10 minutes from each of 60 telephone calls, together with a pronouncing lexicon, to support speech recognition of conversational Farsi.

Emotional Prosody - The LDC has recently created and published a small corpus to support emotional prosody research. The data consists of recordings and transcripts of professional actors reading a series of semantically neutral utterances (dates and numbers) spanning fourteen distinct emotional categories.

SMART - Source Media Authoring Resources and Tools.

SWB Cellular Transcription - LDC will transcribe 5 minutes of each of 250 different conversations (500 sides) compiled from the SWB Cellular Phase I (GSM) collection. The work is being done for Speaker Identification, so that the transcriptions can be used for R&D and evaluation of speech-to-text systems under conditions of vocoded speech.

SPINE2 - Having completed work on SPINE1, LDC is preparing and transcribing audio files to support the second phase of research on speech recognition in noisy environments.

Switchboard Cellular Phase II - LDC is just completing a small Switchboard collection in which each of 210 speakers participates in an average of 10 telephone calls. The data can be used to support research in speaker verification or speech recognition.



Contact ldc@ldc.upenn.edu
Last modified: Tuesday, 29-Jul-2003 17:06:06 EDT
© 1992-2007 Linguistic Data Consortium, University of Pennsylvania. All Rights Reserved.