This corpus was created by:
M. Padmanabhan, G. Ramaswamy, B. Ramabhadran,
P. S. Gopalakrishnan, and C. Dunn
This CD-ROM corpus consists of 1,801 messages, collected from
volunteers at various IBM sites in the United States, comprising
the training data set and 42 messages in the development test
set. The average voicemail message lasts 31 seconds
and contains about 100 words. Approximately 38% of the messages
are from male speakers; the remainder are from female
speakers. All messages were transcribed by IBM.
There are no updates at this time.
Portions © 1998 International Business Machines Corporation, © 1998 Trustees of the University of Pennsylvania
The Reduced Licensing Fee for this corpus is US$150.