Mandarin Affective Speech is a database of emotional speech consisting of audio
recordings and corresponding transcripts collected in 2005 at the Advanced Computing
and System Laboratory, College of Computer Science and Technology, Zhejiang
University, Hangzhou, People's Republic of China. This corpus was designed with
two goals: first, to serve as a tool for linguistic and prosodic feature investigation
of emotional expression in Mandarin Chinese; and second, to provide a source
of training and test data essential to support research in speaker recognition
with affective speech. The speech database was recorded by prompting speakers
to express different emotional states in response to stimuli. The speakers read
scenarios designed to elicit an emotional response, such as a colleague's mistake
for anger, a pleasant trip for elation, a hurry-up scene for panic, and a puppy's
death for sadness. The five emotional states recorded are characterized as follows:
- Neutral - Simple statements without any emotion.
- Anger - A strong feeling of displeasure or hostility.
- Elation - Feeling glad or happy because of praise.
- Panic - A sudden, overpowering terror, often affecting many people at once.
- Sadness - Affected or characterized by sorrow or unhappiness.
Over 100 speakers participated in the data collection. After screening, recordings
from 68 speakers (23 females, 45 males) were used in this corpus. Most of the
speakers were in their twenties at the time of collection. Information about
the speakers is contained in "SpeakerInfo.doc."
Subjects were given a text to read that consisted of five phrases, twenty sentences and two paragraphs designed to generate the emotional speech. The material included all the phonemes in Mandarin. Each subject read the phrases, sentences and paragraphs portraying the five emotional states: neutral (unemotional), anger, elation, panic and sadness. Altogether this database contains 25,636 utterances. The read material was constructed as follows:
- 5 phrases - "yes", "no" and three nouns: "apple", "train" and "tennis ball". In Chinese, these words contain many different basic vowels and consonants.
- 20 sentences - These sentences include all the phonemes and the most common consonant clusters in Mandarin. The types of sentences are: simple statements, a declarative sentence with an enumeration, general questions (yes/no questions), alternative questions, imperative sentences, exclamatory sentences, special questions (wh-questions).
- 2 paragraphs - Two readings, one selected from a famous Chinese novel, and the other stating a normal fact.
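The total of 25,636 utterances is consistent with one possible breakdown, sketched below in Python. The breakdown is an assumption that this description does not state: each phrase and sentence is taken to be read three times in each of the five emotional states, and each paragraph once in the neutral state only. The function name and its parameters are hypothetical; "MASC.pdf" remains the reference for the actual recording protocol.

```python
def total_utterances(speakers=68, phrases=5, sentences=20, paragraphs=2,
                     emotions=5, repetitions=3):
    """Count utterances under an ASSUMED protocol (not stated in this description):
    phrases and sentences are repeated `repetitions` times per emotional state,
    and paragraphs are read once, in the neutral state only."""
    short_items_per_speaker = (phrases + sentences) * repetitions * emotions
    paragraphs_per_speaker = paragraphs  # assumption: neutral state only
    return speakers * (short_items_per_speaker + paragraphs_per_speaker)


if __name__ == "__main__":
    print(total_utterances())
```

Under these assumptions each speaker contributes 377 utterances, and 68 speakers give the stated total.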
All the data were recorded in a quiet office on an OLYMPUS DM-20 digital voice
recorder with a sampling rate of 22,050 Hz. Afterwards, the recorded voice files
were transferred to a personal computer by USB (Universal Serial Bus). The recordings
were then converted into monophonic Windows PCM format at an 8 kHz sampling frequency
and 16-bit resolution.
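As a minimal sketch, the stated format (mono, 8 kHz, 16-bit PCM) can be checked with Python's standard `wave` module. This assumes the audio is stored as ordinary PCM WAV files; the function names are hypothetical, and the silent demo file is only a stand-in for a real corpus file.

```python
import os
import tempfile
import wave

# Format stated in the description: mono, 16-bit (2 bytes per sample), 8 kHz.
EXPECTED = {"channels": 1, "sample_width_bytes": 2, "sample_rate_hz": 8000}


def wav_format(path):
    """Read the header of a PCM WAV file and return its basic format."""
    with wave.open(path, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": wav.getframerate(),
        }


def matches_expected(path):
    """True if the file is mono, 16-bit, 8 kHz."""
    return wav_format(path) == EXPECTED


def write_silence(path, seconds=0.1):
    """Write a short silent file in the expected format (stand-in for corpus audio)."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(8000)
        wav.writeframes(b"\x00\x00" * int(8000 * seconds))


if __name__ == "__main__":
    demo = os.path.join(tempfile.mkdtemp(), "demo.wav")
    write_silence(demo)
    print(wav_format(demo), matches_expected(demo))
```

To check real data, call `matches_expected` on the path of a file from the corpus instead of the demo file.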
Further information about the data and methodology in this corpus is contained
in the authors' paper, "MASC: A Speech Corpus in Mandarin for Emotional
Analysis and Affective Speaker Recognition," in "MASC.pdf."
Portions © 2005-2007, Zhejiang University, Advanced Computing and System
Laboratory, © 2007 Trustees of the University of Pennsylvania