Introduction
This file contains documentation on the ACE Time Normalization (TERN) 2004 English Training Data v 1.0, Linguistic
Data Consortium (LDC) catalog number LDC2005T07 and ISBN 1-58563-331-3.
This release contains the English training data prepared for the
2004 Time Expression Recognition and Normalization (TERN) Evaluation,
sponsored by the Automatic Content Extraction (ACE) program. The evaluation
was held in August 2004, followed by a workshop in September 2004. Evaluation
participants received this data for training purposes, and it is now being
released for general use.
The annotation specifications for this corpus were developed under DARPA's
Translingual Information Detection, Extraction and Summarization (TIDES)
program, with continuing support from ACE.
The purpose of this corpus and the TERN evaluation is to advance the
state of the art in the automatic recognition and normalization of
natural language temporal expressions. In most language contexts such
expressions are indexical. For example, with "Monday," "last week," or
"three months starting October 1," one must know the narrative reference
time in order to pinpoint the time interval being conveyed by the expression.
In addition, for data exchange purposes, it is essential that the identified
interval be rendered according to an established standard, i.e., normalized.
Accurate identification and normalization of temporal expressions is in turn
essential for the temporal reasoning being demanded by advanced NLP
applications such as question answering, information extraction, and
summarization.
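To make the indexical point concrete, the following is a minimal sketch in Python of how an expression such as "Monday" or "last week" might be resolved against a narrative reference date and rendered as an ISO 8601-style string. It is an illustration only and is not the annotation procedure used for this corpus: the function name normalize, the tiny set of supported expressions, and the rule that a bare weekday name refers to the most recent such day are all assumptions made for the example.

```python
from datetime import date, timedelta

# Weekday names indexed to match date.weekday() (Monday == 0).
WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]


def normalize(expression: str, reference: date) -> str:
    """Resolve a few indexical expressions against a reference date.

    Hypothetical helper for illustration; real normalizers cover far
    more expression types and use context to choose a direction in time.
    """
    text = expression.strip().lower()
    if text == "today":
        return reference.isoformat()
    if text == "yesterday":
        return (reference - timedelta(days=1)).isoformat()
    if text == "last week":
        # Render the ISO week that contains the date seven days earlier.
        year, week, _ = (reference - timedelta(weeks=1)).isocalendar()
        return f"{year}-W{week:02d}"
    if text in WEEKDAYS:
        # Assumption: a bare weekday name means the most recent such day
        # on or before the reference date.
        delta = (reference.weekday() - WEEKDAYS.index(text)) % 7
        return (reference - timedelta(days=delta)).isoformat()
    raise ValueError(f"unsupported expression: {expression!r}")


# The same expression yields different values for different reference dates.
print(normalize("Monday", date(2004, 8, 18)))     # 2004-08-16
print(normalize("Monday", date(2004, 9, 1)))      # 2004-08-30
print(normalize("last week", date(2004, 8, 18)))  # 2004-W33
```

The point of the sketch is that the reference date is a required input: without it, none of these expressions can be mapped to a specific interval.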
Samples
Please examine this sample to see an example of the corpus.
Updates
Additional information, updates, and bug fixes may be available in the LDC
catalog entry for this corpus at LDC2005T07.
Content Copyright
Portions
© 1998 Los Angeles Times-Washington Post News Service, Inc.
© 1998, 2000 American Broadcasting Corporation
© 1998, 2000 Cable News Network, Inc.
© 1998, 2000 Press Association, Inc.
© 1998, 2000 New York Times
© 1998, 2000 National Broadcasting Company, Inc.
© 1998, 2000 Public Radio International
© 2000 Xinhua News
© 2000 SPH AsiaOne Ltd.
© 2000 China National Radio
© 2000 China Television System
© 2000 China TV Program Agency
© 2000 China Broadcasting System
"The World" is a co-production of Public Radio International and the
British Broadcasting Corporation and is produced at WGBH Boston.