Introduction
Message Understanding Conference (MUC) 7 was produced by the Linguistic Data Consortium (LDC) under catalog number LDC2001T02 and ISBN 1-58563-205-8.
In the 1990s, the MUC evaluations funded the development of metrics and statistical algorithms to support government evaluations of emerging information extraction technologies. Additional information is available from NIST.
Data
The following table shows the correspondence between versions of the
information extraction (IE) task definition and stages of the MUC-7 evaluation.
| Version # | Stage |
| 4.1 | training and dryrun |
| 4.2 | formalrun |
| 5.1 | final |
The dryrun and formalrun cover different domains: the dryrun (and
training) data consists of air crash scenarios, and the formalrun data
consists of missile launch scenarios. The final version mainly updates the
Template Relations portion of the guidelines.
Normally, two datasets are provided for each scenario: training and
test. When the evaluation cycle begins, the scenario dataset is labeled
training. The corresponding test dataset for that same scenario is then
used for dryrun testing. For the formal run, a formal training set is
given out four weeks before the test answers are due, and the formal test
set is given out one week before they are due. After the entire
evaluation and meeting have been held, final edits are made if necessary.
Updates
August 22, 2001: This publication was inadvertently released without the
guidelines documentation and the scoring software. These documents and programs
have now been added to the publication. If you previously purchased this
corpus and would like to download a complete copy, please contact
ldc@ldc.upenn.edu.
Copyright
Portions © 1996 New York Times, © 2001 Trustees of the University of Pennsylvania