Introduction
The Santa Barbara Corpus of Spoken American English is based on
hundreds of recordings of natural speech from all over the United
States, representing a wide variety of people of different regional
origins, ages, occupations, and ethnic and social backgrounds. It
reflects many ways that people use language in their lives:
conversation, gossip, arguments, on-the-job talk, card games, city
council meetings, sales pitches, classroom lectures, political
speeches, bedtime stories, sermons, weddings, and more.
Data
The three CD-ROM volumes in Part I contain 14 speech files of between
15 and 30 minutes each, from the Santa Barbara Corpus of Spoken
American English. Collected by: University of
California, Santa Barbara Center for the Study of Discourse, Director John
W. Du Bois (UCSB), Associate Editors: Wallace L. Chafe (UCSB), Charles W. Meyer
(UMass, Boston), and Sandra A. Thompson (UCSB). The Santa Barbara Corpus of
Spoken American English is part of the International Corpus of English (Charles
W. Meyer, Director), representing the American Component.
Each speech file is accompanied by a transcript in which phrases are
time stamped with respect to the audio recording. Personal names,
place names, phone numbers, etc., in the transcripts have been altered
to preserve the anonymity of the speakers and their acquaintances, and
the audio files have been filtered to make these portions of the
recordings unrecognizable.
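As a rough illustration of how time-stamped phrases might be used, the sketch below parses transcript lines and selects the phrases that overlap a given stretch of audio. The line layout (start time, end time, speaker label, and text separated by tabs), the speaker labels, and the names Phrase, parse_line, and phrases_between are all assumptions made for this example; check the transcript files themselves for the actual layout before relying on it.

```python
from typing import List, NamedTuple, Optional


class Phrase(NamedTuple):
    start: float   # seconds from the start of the recording
    end: float     # seconds from the start of the recording
    speaker: str   # empty when the line continues the previous turn
    text: str


def parse_line(line: str) -> Optional[Phrase]:
    """Parse one line in the ASSUMED layout: start<TAB>end<TAB>speaker<TAB>text."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 4:
        return None  # not a phrase line under the assumed layout
    try:
        start, end = float(fields[0]), float(fields[1])
    except ValueError:
        return None  # time stamps are not numeric
    speaker = fields[2].strip().rstrip(":")
    return Phrase(start, end, speaker, fields[3].strip())


def phrases_between(phrases: List[Phrase], t0: float, t1: float) -> List[Phrase]:
    """Return the phrases whose time span overlaps the interval t0..t1 (seconds)."""
    return [p for p in phrases if p.start < t1 and p.end > t0]


# Invented sample lines, used only to exercise the functions above.
sample = [
    "0.00\t2.50\tSPEAKER_A:\tfirst phrase",
    "2.50\t4.10\t\tsecond phrase",
    "4.10\t7.00\tSPEAKER_B:\tthird phrase",
]
phrases = [p for p in map(parse_line, sample) if p is not None]
```

Calling phrases_between(phrases, 2.0, 3.0) on the invented sample selects the first two phrases, which is the kind of lookup needed to locate the audio segment for a given stretch of transcript.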
For the latest information on this corpus, please refer to the following sites devoted to it:
http://
http://www.linguistics.ucsb.edu/research/sbcorpus.html
http://www.ldc.upenn.edu/Projects/SBCSAE
Samples
For an example of the data in this corpus, please examine these samples of the recordings and transcripts:
Updates
There are no updates at this time.