Introduction
This file contains documentation on the 2002 Emotional Prosody Speech and
Transcripts, Linguistic Data Consortium (LDC) catalog number LDC2002S28 and
ISBN 1-58563-237-6.
This publication contains audio recordings and corresponding transcripts,
collected over an eight-month period in 2000-2001 and designed to support
research in emotional prosody. The recordings consist of professional actors
reading a series of semantically neutral utterances (dates and numbers)
spanning fourteen distinct emotional categories, selected after Banse &
Scherer's study of vocal emotional expression in German. (Banse, R. & Scherer,
K. R. 1996. Acoustic profiles in vocal emotion expression. Journal of
Personality and Social Psychology, 70, 614-636.)
Actor participants were provided with descriptions of each emotional
context, including situational examples adapted from those used in the original
German study. Flashcards were used to display series of four-syllable dates and
numbers to be uttered in the appropriate emotional category.
The Prosody Recordings Project is interested in capturing the aspects of
speech (emotion, intonation) that are left out of the written form of a
message. In these experiments, simple phrases are expressed in ways that
reflect varied contexts. The same phrase might be used to answer different
questions, address listeners at different distances from the speaker, or
express different emotional states. Actors were used because they are experts
at producing this kind of contextual variation in a natural and convincing way.
More information about this project can be found at
http://www.ldc.upenn.edu/Projects/Prosody/.
Data
There are 30 data files: 15 recordings in sphere format and their
transcripts.
The sphere files are encoded in two-channel interleaved 16-bit PCM,
high-byte-first ("big-endian") format, for a total of 2,912,067,980 bytes (2777
Mbytes) or nine hours of sphere data.
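The size and duration figures above can be cross-checked with a little arithmetic. The following is a minimal sketch, assuming the 22.05 kHz sampling rate given elsewhere in this document and ignoring the small per-file header overhead; the function name approx_hours is illustrative only and is not part of any corpus tooling.

```python
# Consistency check of the stated corpus size against the stated duration.
TOTAL_BYTES = 2_912_067_980   # total size quoted in this document
SAMPLE_RATE_HZ = 22_050       # 22.05 kHz, as stated for the recordings
CHANNELS = 2                  # two-channel interleaved
BYTES_PER_SAMPLE = 2          # 16-bit PCM


def approx_hours(total_bytes: int) -> float:
    """Approximate audio duration in hours, ignoring header overhead."""
    bytes_per_second = SAMPLE_RATE_HZ * CHANNELS * BYTES_PER_SAMPLE
    return total_bytes / bytes_per_second / 3600


hours = approx_hours(TOTAL_BYTES)
mbytes = TOTAL_BYTES / 2**20  # "Mbytes" here means 2**20 bytes

print(f"{mbytes:.0f} Mbytes, about {hours:.1f} hours")
```

At 88,200 bytes per second of two-channel audio, the byte count works out to roughly nine hours, which agrees with the figure quoted above.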
The utterances were recorded directly into WAVES+ datafiles, on two channels
with a sampling rate of 22.05 kHz. The two microphones used were a stand-mounted
boom Shure SN94 and a headset Sennheiser HMD 410.
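Because each microphone occupies its own channel, analysis usually starts by separating the two interleaved channels. The sketch below shows one way to do that with the Python standard library. It rests on assumptions not stated in this document: that each file begins with the usual NIST SPHERE ASCII header whose first two lines are "NIST_1A" and the header size in bytes, and that the sample data follows uncompressed. The helper names sphere_header_size and split_channels are hypothetical, and this document does not say which microphone is on which channel.

```python
import array
import struct
import sys


def sphere_header_size(first_16_bytes: bytes) -> int:
    """Return the header length declared at the start of a SPHERE file.

    Assumes the conventional layout: b"NIST_1A\n" followed by an
    8-byte field holding the header size as ASCII digits.
    """
    if first_16_bytes[:8] != b"NIST_1A\n":
        raise ValueError("not a NIST SPHERE header")
    return int(first_16_bytes[8:16].decode("ascii").strip())


def split_channels(raw: bytes) -> tuple[array.array, array.array]:
    """Split interleaved big-endian 16-bit two-channel PCM into two arrays."""
    usable = len(raw) - len(raw) % 4      # keep whole two-channel frames only
    samples = array.array("h")            # signed 16-bit, native byte order
    samples.frombytes(raw[:usable])
    if sys.byteorder == "little":
        samples.byteswap()                # the data is high-byte-first
    return samples[0::2], samples[1::2]   # first channel, second channel


# Synthetic demonstration: two frames of (first, second) channel samples.
demo_payload = struct.pack(">4h", 1, -1, 2, -2)
first, second = split_channels(demo_payload)
print(list(first), list(second))
```

To apply this to a real file, one would read the first 16 bytes, pass them to sphere_header_size, and hand everything after that offset to split_channels. Listening to each channel is the safest way to determine which microphone it carries.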
The original session recordings are provided in their entirety,
including informal chit-chat and discussion between each emotion category
elicitation task. Time alignment is limited to utterances within the formal
elicitation tasks, and miscellaneous regions have been marked as such.
Updates
There are no updates at this time.
Content Copyright
Portions © 2000-2002 Trustees of the University of Pennsylvania.