



Voicemail Corpus Part I

Item Name: Voicemail Corpus Part I
Authors: M. Padmanabhan, G. Ramaswamy, B. Ramabhadran, P. S. Gopalakrishnan and C. Dunn
LDC Catalog No.: LDC98S77
ISBN: 1-58563-141-8
Data Type: speech
Sample Rate: 8000 Hz
Sampling Format: 1-channel ulaw
Data Source(s): telephone speech
Application(s): speech recognition
Language(s): English
Language ID(s): eng
Distribution: 1 CD
Member fee: $0 for 1998 members
Non-member Fee: N/A (Members Only)
Reduced-License Fee: N/A
Extra-Copy Fee: US $150.00
Online documentation: yes
Licensing Instructions: Subscription Members, Standard Members, Non-Members
Citation: M. Padmanabhan, et al. 1998. Voicemail Corpus Part I. Linguistic Data Consortium, Philadelphia.

Introduction

This corpus was created by:
M. Padmanabhan, G. Ramaswamy, B. Ramabhadran, P. S. Gopalakrishnan and C. Dunn

Data

This CD-ROM corpus consists of a training data set of 1,801 messages, collected from volunteers at various IBM sites in the United States, and a development test set of 42 messages. The average voicemail message is 31 seconds long and contains about 100 words. Approximately 38% of the messages are from male speakers; the remainder are from female speakers. All messages were transcribed by IBM.
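
The figures quoted above allow a rough estimate of the total amount of audio in the corpus. The following is a minimal sketch, not part of the corpus documentation: it assumes that the stated averages (31 seconds and about 100 words per message) apply to both the training and development test sets, and that "1-channel ulaw" at 8000 Hz means one byte per sample with no file header overhead. The function name estimate_corpus and the constant names are hypothetical.

```python
# Figures quoted in the catalog entry.
TRAIN_MESSAGES = 1801
DEV_TEST_MESSAGES = 42
AVG_SECONDS_PER_MESSAGE = 31   # stated average duration (approximate)
AVG_WORDS_PER_MESSAGE = 100    # stated average length (approximate)
SAMPLE_RATE_HZ = 8000

# Assumption: 8-bit mu-law, one channel, so one byte per sample.
BYTES_PER_SAMPLE = 1


def estimate_corpus(messages: int) -> dict:
    """Estimate duration, word count and raw audio size for a message count."""
    seconds = messages * AVG_SECONDS_PER_MESSAGE
    return {
        "messages": messages,
        "hours": seconds / 3600,
        "words": messages * AVG_WORDS_PER_MESSAGE,
        "audio_megabytes": seconds * SAMPLE_RATE_HZ * BYTES_PER_SAMPLE / 1e6,
    }


if __name__ == "__main__":
    total = estimate_corpus(TRAIN_MESSAGES + DEV_TEST_MESSAGES)
    print(f"messages: {total['messages']}")
    print(f"audio:    {total['hours']:.1f} hours")
    print(f"words:    {total['words']}")
    print(f"raw size: {total['audio_megabytes']:.0f} MB")
```

Under these assumptions the estimate comes to a little under 16 hours of speech, roughly 184,000 words, and about 457 MB of raw audio, which is consistent with distribution on a single CD. Because the averages are approximate, the result is only an order-of-magnitude check.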

Updates

There are no updates at this time.

Copyright

Portions © 1998 International Business Machines Corporation, © 1998 Trustees of the University of Pennsylvania

Pricing

The Reduced Licensing Fee for this corpus is US$150.



Contact: ldc@ldc.upenn.edu

(c) 1992-2010 Linguistic Data Consortium, University of Pennsylvania. All Rights Reserved.