Annotation Tasks and Specifications | Linguistic Data Consortium

ACE 2008

ACE 2008 tasks included Local (within-document) EDR (Entity Detection and Recognition) and RDR (Relation Detection and Recognition) for English and Arabic. ACE 2008 also included a pilot task for Global (cross-document) EDR and RDR for English and Arabic.

ACE 2007

Evaluation

Tasks for ACE 2007 included a pilot evaluation using Spanish data for entity detection and recognition (EDR) and temporal expression recognition and normalization (TERN). Data selection was semi-automatic: a set of candidate documents was manually reviewed to select individual documents suitable for ACE annotation, for instance documents that were representative of their genre and contained targeted ACE entity types.

Spanish Entities V1.6

Entity Translation Pilot Evaluation

A pilot evaluation of Entity Translation (ET) was also conducted as part of ACE 2007. Systems participating in the pilot ET task were evaluated on their ability to take in a text document in one language (either Mandarin Chinese or Arabic) and emit an English-language catalog of the entities mentioned in the document. With support from the REFLEX program, LDC created reference translations and ACE annotations for the ET pilot task. Data came from three languages (English, Chinese and Arabic) and from two genres, newswire and weblogs. LDC used manual data selection to maximize the concentration of names in the selected files. Some attempt was made to include files containing infrequent names rather than targeting only the names of the most common newsmakers.

ACE 2005

In 2005 ACE expanded to include Event annotation for Arabic, English and Chinese. ACE 2005 included careful, targeted data selection.

Entities

Entity tagging is the core annotation task, providing the foundation for all remaining tasks. Seven types of entities are identified: Person, Organization, Location, Facility, Weapon, Vehicle and Geo-Political Entity (GPE). Each type is further divided into subtypes (for instance, Person subtypes include Individual, Group and Indefinite). Annotators tag all mentions of each entity within a document, whether named, nominal or pronominal. For every mention, the annotator identifies the maximal extent of the string that represents the entity and labels the head of each mention. Nested mentions are also captured. Each entity is classified according to its type and subtype. Each entity mention is further tagged according to its class: specific, generic, attributive, negatively quantified or underspecified. Annotators also review the entire document to group mentions of the same entity together; they also label cases of metonymy, where the name of one entity is used to refer to another entity (or entities) related to it.
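As a rough illustration of the entity description above, the following Python sketch models an entity as a typed group of mentions, each with a maximal extent, a head, a level (named, nominal or pronominal) and a class. It is not the ACE file format or any LDC tool: the class and field names, the character-offset convention, the `span` helper and the "Nation" subtype string are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from enum import Enum


class EntityType(Enum):
    PERSON = "Person"
    ORGANIZATION = "Organization"
    LOCATION = "Location"
    FACILITY = "Facility"
    WEAPON = "Weapon"
    VEHICLE = "Vehicle"
    GPE = "Geo-Political Entity"


class MentionLevel(Enum):
    NAMED = "named"
    NOMINAL = "nominal"
    PRONOMINAL = "pronominal"


# Mention classes as listed in the text above.
MENTION_CLASSES = {
    "specific", "generic", "attributive",
    "negatively quantified", "underspecified",
}


@dataclass
class Mention:
    extent: tuple[int, int]  # assumed convention: character offsets [start, end)
    head: tuple[int, int]    # the head must lie inside the extent
    level: MentionLevel
    mention_class: str

    def is_valid(self) -> bool:
        inside = self.extent[0] <= self.head[0] and self.head[1] <= self.extent[1]
        return inside and self.mention_class in MENTION_CLASSES


@dataclass
class Entity:
    entity_type: EntityType
    subtype: str
    mentions: list[Mention] = field(default_factory=list)


def span(text: str, phrase: str) -> tuple[int, int]:
    """Hypothetical helper: offsets of the first occurrence of phrase."""
    start = text.index(phrase)
    return (start, start + len(phrase))


text = "The president of France said she would resign."

# One Person entity with a nominal mention and a coreferent pronominal mention.
president = Entity(EntityType.PERSON, "Individual", [
    Mention(span(text, "The president of France"), span(text, "president"),
            MentionLevel.NOMINAL, "specific"),
    Mention(span(text, "she"), span(text, "she"),
            MentionLevel.PRONOMINAL, "specific"),
])

# A nested mention: "France" sits inside the extent of the first mention above.
# The subtype string "Nation" is a placeholder chosen for this example.
france = Entity(EntityType.GPE, "Nation", [
    Mention(span(text, "France"), span(text, "France"),
            MentionLevel.NAMED, "specific"),
])
```

Grouping both mentions under one `Entity` object stands in for the document-level coreference step, and the `france` mention shows how a nested mention can be recorded as a separate entity whose extent falls inside another mention's extent.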

Values

A Value is a text string that further characterizes the properties of some Entity or Event. The value annotation task consists of simply identifying the text string containing the value mention. Annotators tag the following values: NUMERIC, CONTACT-INFO, TIMEX2, JOB-TITLE, CRIME and SENTENCE.
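Because value annotation amounts to marking a text string and assigning one of six labels, it can be sketched very compactly. The snippet below is a minimal illustration only; the `ValueMention` class, its offset convention and the example sentence are assumptions, and which label a given phrase receives is decided by the annotation guidelines, not by this code.

```python
from dataclasses import dataclass

# The six value types named in the text above.
VALUE_TYPES = {"NUMERIC", "CONTACT-INFO", "TIMEX2", "JOB-TITLE", "CRIME", "SENTENCE"}


@dataclass(frozen=True)
class ValueMention:
    value_type: str
    start: int  # assumed convention: character offsets [start, end)
    end: int

    def __post_init__(self):
        if self.value_type not in VALUE_TYPES:
            raise ValueError(f"unknown value type: {self.value_type}")
        if not 0 <= self.start < self.end:
            raise ValueError("offsets must satisfy 0 <= start < end")


def value_text(text: str, mention: ValueMention) -> str:
    """Return the text string covered by a value mention."""
    return text[mention.start:mention.end]


sentence = "He was convicted of fraud."
start = sentence.index("fraud")
crime = ValueMention("CRIME", start, start + len("fraud"))
```

Validating the label in `__post_init__` keeps ill-typed value mentions from being constructed at all, which is one simple way to enforce a closed label set.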

Relations

The goal of the Relation task is to detect and characterize Relations of the targeted Types between entities. Relations are ordered pairs of entities. Annotators label the type and subtype for each relation, along with its syntactic class and syntactic extent. Relations are also tagged for modality and tense. Finally, annotators timestamp relations that contain temporal expressions within their extent.
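The fact that relations are ordered pairs can be made concrete with a small sketch. The record below carries the attributes named in the paragraph above; the field names, the entity IDs and every label value ("ORG-AFF", "Employment", "Verbal", "Asserted", "Past") are placeholders for illustration and should not be read as the official ACE inventory.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Relation:
    arg1: str             # hypothetical ID of the first entity
    arg2: str             # hypothetical ID of the second entity
    rel_type: str
    subtype: str
    syntactic_class: str
    extent: str           # syntactic extent of the relation in the text
    modality: str
    tense: str
    timestamp: Optional[str] = None  # set only if the extent has a temporal expression


employment = Relation(
    arg1="E1", arg2="E2",
    rel_type="ORG-AFF", subtype="Employment",
    syntactic_class="Verbal",
    extent="Smith joined Acme in 2003",
    modality="Asserted", tense="Past",
    timestamp="2003",
)

# Swapping the arguments yields a different relation, since order is significant.
swapped = replace(employment, arg1=employment.arg2, arg2=employment.arg1)
```

Using a frozen dataclass gives value-based equality, so `employment` and `swapped` compare as unequal even though they involve the same two entities.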

Events

Event annotation is limited to a constrained set of types and subtypes. For each atomic event mentioned in the text, annotators label the event's extent (the sentence containing the event) and its trigger (the word that most clearly expresses the event's occurrence). Annotators further tag all of the participants (ACE entities) involved in that event. In addition, annotators label attributes (entities and values) that are part of the event but do not constitute event participants. Event participants and attributes taken together are known as event arguments. Finally, annotators label each event for polarity, tense, genericity and modality.
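The relationship between participants, attributes and arguments described above can be sketched as a small data structure. This is an illustrative model only: the class names, role strings, IDs and the type, subtype and attribute label values are assumptions for the example and are not taken from the ACE guidelines.

```python
from dataclasses import dataclass, field


@dataclass
class Argument:
    ref_id: str  # hypothetical ID of an entity or value annotated elsewhere
    role: str    # illustrative role label


@dataclass
class Event:
    event_type: str
    subtype: str
    extent: str    # the sentence containing the event
    trigger: str   # the word that most clearly expresses the occurrence
    participants: list[Argument] = field(default_factory=list)
    attributes: list[Argument] = field(default_factory=list)
    polarity: str = ""
    tense: str = ""
    genericity: str = ""
    modality: str = ""

    @property
    def arguments(self) -> list[Argument]:
        # Event arguments are the participants and attributes taken together.
        return self.participants + self.attributes

    def trigger_in_extent(self) -> bool:
        return self.trigger in self.extent


attack = Event(
    event_type="Conflict", subtype="Attack",
    extent="Troops attacked the village on Tuesday.",
    trigger="attacked",
    participants=[Argument("E1", "attacker"), Argument("E2", "target")],
    attributes=[Argument("V1", "time")],
    polarity="Positive", tense="Past",
    genericity="Specific", modality="Asserted",
)
```

Exposing `arguments` as a derived property keeps participants and attributes stored separately while still offering the combined view the text describes.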

ACE 2004

ACE 2004 included Entities and Relations for English, Chinese and Arabic.

ACE 2003

ACE 2003 included Entities and Relations for Chinese, and Entities only for Arabic.

ACE Phase 2 (November 2002)

ACE Phase 2 included Entities and Relations for English.

ACE Phase 1 (February 2002) and ACE Pilot

ACE Phase 1 and ACE Pilot included Entities for English.

Entity Detection and Tracking: Phase1 v2.2