ATIS3 Test Data
| Item Name: | ATIS3 Test Data |
| Authors: | Deborah A. Dahl, Madeleine Bates, Michael Brown, William Fisher, Kate Hunicke-Smith, David Pallett, Christine Pao, Alexander Rudnicky, Elizabeth Shriberg, John Garofolo, Jonathan Fiscus, Denise Danielson, Enrico Bocchieri, Bruce Buntschuh, Beverly Schwartz, Sandra Peters, Robert Ingria, Robert Weide, Yuzong Chang, Eric Thayer, Lynette Hirschman, Joe Polifroni, Bruce Lund, Goh Kawai, Tom Kuhn, and Lew Norton |
| LDC Catalog No.: | LDC95S26 |
| NIST Catalog No.: | 17-4.2 through 17-5.1 |
| ISBN: | 1-58563-043-8 |
| Data Type: | speech |
| Sample Rate: | 16000 Hz |
| Sampling Format: | 1-channel pcm compressed |
| Data Source(s): | microphone speech |
| Project(s): | ATIS |
| Application(s): | speech recognition, spoken dialogue systems |
| Language(s): | English |
| Language ID(s): | ENG |
| Distribution: | 1 DVD |
| Member fee: | $0 for 1995 members |
| Non-member Fee: | US $1500.00 |
| Reduced-License Fee: | US $750.00 |
| Extra-Copy Fee: | US $200.00 |
| Non-member License: | yes |
| Readme File: | yes |
| Online documentation: | yes |
| Licensing Instructions: | Subscription Members, Standard Members, Non-Members |
| Citation: | Deborah A. Dahl, et al. 1995. ATIS3 Test Data. Linguistic Data Consortium, Philadelphia. |
This set of discs contains a corpus of speech and natural language data
collected under the auspices of the
Advanced Research Projects Agency Spoken Language Systems (ARPA-SLS) technology development
program. The corpus, which contains data in the Air Travel Information Services (ATIS) domain, was
designed by the ARPA-SLS Multi-site ATIS Data COllection Working (MADCOW) group and was
collected at five sites across the U.S.:
- BBN Systems & Technologies, Cambridge, MA
- Carnegie Mellon University, Pittsburgh, PA
- MIT Laboratory for Computer Science, Boston, MA
- National Institute of Standards and Technology, Gaithersburg, MD
- SRI International, Menlo Park, CA
The corpus on this set of discs is part of the third phase of collection of
ATIS data (ATIS3) and comprises
the development test material (NIST Speech Disc 17-4.2) and the evaluation test material (NIST Speech Disc 17-5.1)
used in the December 1994 ARPA SLS Benchmark Tests. As in the previous ATIS corpora, the speech
contained in this corpus was elicited by presenting subjects with various hypothetical travel planning
scenarios to solve. The resulting spontaneous spoken queries were recorded as
the subjects interacted with partially or completely automated ATIS systems to solve the
scenarios. Note that the ATIS3 training data
is available on NIST Speech Discs 17-1.1 through 17-3.1.
The recorded speech has been transcribed and annotated with categorizations and
canonical reference answers. All of the utterances on these discs were recorded using a
close-talking, noise-canceling, head-mounted
Sennheiser microphone. For some subjects, data from a secondary (noisier) microphone was also recorded
simultaneously.
These discs also contain the ATIS3 46-city/52-airport relational database, a revised Principles of
Interpretation, test implementation and scoring instructions, and other general documentation.
The ATIS3 corpus was verified, collated, documented, and produced on CD-ROM by the National
Institute of Standards and Technology (NIST) in cooperation with MADCOW and is distributed by the
Linguistic Data Consortium (LDC).