Annotation Tasks and Specifications
ACE 2008 tasks included Local (within-document) EDR (Entity Detection and Recognition) and RDR (Relation Detection and Recognition) for English and Arabic. ACE 2008 also included a pilot task for Global (cross-document) EDR and RDR for English and Arabic.
- English Entities V6.6
- English Relations V6.2
- Arabic Entities V7.4.2
- Arabic Relations V6.5
- Cross-Document Coreference
Tasks for ACE 2007 included a pilot evaluation using Spanish data for entity detection and recognition (EDR) and temporal expression recognition and normalization (TERN). Data selection was semi-automatic: a set of candidate documents was manually reviewed to select individual documents suitable for ACE annotation, for instance documents that were representative of their genre and contained targeted ACE entity types.
Entity Translation Pilot Evaluation
A pilot evaluation of Entity Translation was also conducted as part of ACE 2007. Systems participating in the pilot ET task were evaluated on their ability to take in a text document in one language (either Mandarin Chinese or Arabic) and emit an English language catalog of the entities mentioned in the document. LDC created reference translations and ACE annotations to support the ET pilot task with support from the REFLEX program. Data came from three languages -- English, Chinese and Arabic -- and from two genres, newswire and weblogs. LDC used manual data selection to maximize concentration of names in the selected files. Some attempt was made to incorporate files containing infrequent names instead of only targeting names of the most common newsmakers.
In 2005 ACE expanded to include Event annotation for Arabic, English and Chinese. ACE 2005 included careful, targeted data selection.
Entity tagging is the core annotation task, providing the foundation for all remaining tasks. Seven types of entities are identified: Person, Organization, Location, Facility, Weapon, Vehicle and Geo-Political Entity (GPE). Each type is further divided into subtypes (for instance, Person subtypes include Individual, Group and Indefinite). Annotators tag all mentions of each entity within a document, whether named, nominal or pronominal. For every mention, the annotator identifies the maximal extent of the string that represents the entity and labels the head of the mention. Nested mentions are also captured. Each entity is classified according to its type and subtype. Each entity mention is further tagged according to its class: specific, generic, attributive, negatively quantified or underspecified. Annotators also review the entire document to group mentions of the same entity together, and they label cases of metonymy, where the name of one entity is used to refer to another entity (or entities) related to it.
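As an illustration of how these pieces fit together, the following is a minimal sketch of one way to model entities and mentions in Python. It is not the ACE file format: the class names, field names, character-offset convention and the example sentence are all assumptions made for illustration, and the "Nation" subtype is given only as an example value.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Tuple

# Hypothetical convention: (start, end) character offsets, end exclusive.
Span = Tuple[int, int]


class EntityType(Enum):
    PERSON = "Person"
    ORGANIZATION = "Organization"
    LOCATION = "Location"
    FACILITY = "Facility"
    WEAPON = "Weapon"
    VEHICLE = "Vehicle"
    GPE = "Geo-Political Entity"


class MentionLevel(Enum):
    NAMED = "named"
    NOMINAL = "nominal"
    PRONOMINAL = "pronominal"


class MentionClass(Enum):
    SPECIFIC = "specific"
    GENERIC = "generic"
    ATTRIBUTIVE = "attributive"
    NEGATIVELY_QUANTIFIED = "negatively quantified"
    UNDERSPECIFIED = "underspecified"


@dataclass
class Mention:
    extent: Span  # maximal string representing the entity
    head: Span    # head of the mention, inside the extent
    level: MentionLevel
    mention_class: MentionClass

    def __post_init__(self) -> None:
        # The head must fall within the maximal extent.
        if not (self.extent[0] <= self.head[0] <= self.head[1] <= self.extent[1]):
            raise ValueError("head must fall within the extent")


@dataclass
class Entity:
    entity_type: EntityType
    subtype: str
    # All mentions of the same entity in a document are grouped here.
    mentions: List[Mention] = field(default_factory=list)


def surface(text: str, span: Span) -> str:
    """Return the text covered by a span."""
    return text[span[0]:span[1]]


text = "The president of France said he would visit."

# One Person entity with a nominal and a pronominal mention.
president = Entity(
    entity_type=EntityType.PERSON,
    subtype="Individual",
    mentions=[
        Mention((0, 23), (4, 13), MentionLevel.NOMINAL, MentionClass.SPECIFIC),
        Mention((29, 31), (29, 31), MentionLevel.PRONOMINAL, MentionClass.SPECIFIC),
    ],
)

# A nested mention: "France" sits inside the extent of the first mention above.
france = Entity(
    entity_type=EntityType.GPE,
    subtype="Nation",  # illustrative subtype value
    mentions=[Mention((17, 23), (17, 23), MentionLevel.NAMED, MentionClass.SPECIFIC)],
)
```

Storing the extent and head as separate spans keeps nested mentions simple to represent: the GPE mention "France" overlaps the Person mention's extent without any special handling.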
A Value is a text string that further characterizes the properties of some Entity or Event. The value annotation task consists of simply identifying the text string containing the value mention. Annotators tag the following values: NUMERIC, CONTACT-INFO, TIMEX2, JOB-TITLE, CRIME and SENTENCE.
- English Values V1.2.4
- Chinese Values V1.1.2
- Arabic Values V1.2.3
- English TIMEX2 Summary V0.1
- Chinese TIMEX2 Summary V1.2
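Because value annotation amounts to marking a text string and assigning it one of the listed value types, it can be sketched very compactly. The snippet below is an illustrative model only: the `ValueMention` class, the `make_value` helper and the example sentence are hypothetical and do not come from the guidelines.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class ValueType(Enum):
    NUMERIC = "NUMERIC"
    CONTACT_INFO = "CONTACT-INFO"
    TIMEX2 = "TIMEX2"
    JOB_TITLE = "JOB-TITLE"
    CRIME = "CRIME"
    SENTENCE = "SENTENCE"


@dataclass(frozen=True)
class ValueMention:
    span: Tuple[int, int]  # (start, end) character offsets, end exclusive
    value_type: ValueType


def make_value(text: str, substring: str, value_type: ValueType) -> ValueMention:
    """Hypothetical helper: tag the first occurrence of substring in text."""
    start = text.index(substring)  # raises ValueError if the string is absent
    return ValueMention((start, start + len(substring)), value_type)


text = "The former mayor was convicted of fraud and sentenced to five years in prison."

values = [
    make_value(text, "mayor", ValueType.JOB_TITLE),
    make_value(text, "fraud", ValueType.CRIME),
    make_value(text, "five years in prison", ValueType.SENTENCE),
]

# Pair each tagged string with its value type label.
labels = [(text[v.span[0]:v.span[1]], v.value_type.value) for v in values]
```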
The goal of the Relation task is to detect and characterize Relations of the targeted Types between entities. Relations are ordered pairs of entities. Annotators label the type and subtype for each relation, along with its syntactic class and syntactic extent. Relations are also tagged for modality and tense. Finally, annotators timestamp relations that contain temporal expressions within their extent.
- English Relations V5.8.3
- Chinese Relations V5.5.1
- Arabic Relations V5.3.4
- English Timestamping V3
- Chinese Timestamping V2
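The point that relations are ordered pairs can be made concrete with a small sketch. The model below is an assumption for illustration, not the ACE representation: the class and field names are hypothetical, and the type, subtype, syntactic class, modality and tense strings are example values only.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class EntityRef:
    entity_id: str  # hypothetical identifier for an annotated entity
    name: str


@dataclass(frozen=True)
class Relation:
    arg1: EntityRef  # first argument of the ordered pair
    arg2: EntityRef  # second argument of the ordered pair
    relation_type: str
    subtype: str
    syntactic_class: str
    modality: str
    tense: str
    # Set only when a temporal expression occurs within the relation's extent.
    timestamp: Optional[str] = None


smith = EntityRef("E1", "Smith")
acme = EntityRef("E2", "Acme")

# "Smith works for Acme": the person is arg1 and the organization is arg2.
employment = Relation(
    arg1=smith,
    arg2=acme,
    relation_type="ORG-AFF",    # illustrative value
    subtype="Employment",       # illustrative value
    syntactic_class="Verbal",   # illustrative value
    modality="Asserted",        # illustrative value
    tense="Present",            # illustrative value
)

# Swapping the arguments produces a different relation, since order matters.
swapped = replace(employment, arg1=acme, arg2=smith)
order_matters = employment != swapped
```

Using a frozen dataclass makes relations hashable and comparable by value, so two annotations are equal only when both arguments appear in the same order with the same labels.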
Event annotation is limited to a constrained set of types and subtypes. For each atomic event mentioned in the text, annotators label the event's extent (the sentence containing the event) and its trigger (the word that most clearly expresses the event's occurrence). Annotators further tag all of the participants (ACE entities) involved in that event. In addition, annotators label attributes (entities and values) that are part of the event but do not constitute event participants. Event participants and attributes taken together are known as event arguments. Finally, annotators label each event for polarity, tense, genericity and modality.
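The relationship between extent, trigger, participants, attributes and arguments can be sketched as a small data model. This is a hedged illustration rather than the ACE format: the `Event` and `Argument` classes, the `span_of` helper, the role names and all label strings are assumptions chosen for the example.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


def span_of(text: str, substring: str) -> Span:
    """Hypothetical helper: span of the first occurrence of substring."""
    start = text.index(substring)
    return (start, start + len(substring))


@dataclass
class Argument:
    role: str  # illustrative role label
    span: Span


@dataclass
class Event:
    event_type: str
    subtype: str
    extent: Span   # the sentence containing the event
    trigger: Span  # the word that most clearly expresses the event
    polarity: str
    tense: str
    genericity: str
    modality: str
    participants: List[Argument] = field(default_factory=list)
    attributes: List[Argument] = field(default_factory=list)

    @property
    def arguments(self) -> List[Argument]:
        # Participants and attributes together form the event arguments.
        return self.participants + self.attributes


text = "Rebels attacked the village on Tuesday."

attack = Event(
    event_type="Conflict",   # illustrative value
    subtype="Attack",        # illustrative value
    extent=span_of(text, text),
    trigger=span_of(text, "attacked"),
    polarity="Positive",     # illustrative value
    tense="Past",            # illustrative value
    genericity="Specific",   # illustrative value
    modality="Asserted",     # illustrative value
    participants=[
        Argument("Attacker", span_of(text, "Rebels")),
        Argument("Target", span_of(text, "the village")),
    ],
    attributes=[Argument("Time", span_of(text, "Tuesday"))],
)

# List each argument's role with the text it covers.
roles = [(a.role, text[a.span[0]:a.span[1]]) for a in attack.arguments]
```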
ACE 2004 included Entities and Relations for English, Chinese and Arabic.
- English Entity Guidelines V4.2.6
- English Linking Guidelines V3.0
- English Relations Guidelines V4.3.2
- Chinese Entity Guidelines V4.2.4
- Chinese Linking Guidelines V2.0
- Chinese Relations Guidelines V4.3
- Arabic Entity Guidelines V4.2.3
- Arabic Linking Guidelines V1.0
- Arabic Relations Guidelines V4.3
ACE 2003 included Entities and Relations for Chinese, and Entities only for Arabic.
ACE Phase 2 (November 2002)
ACE Phase 2 included Entities and Relations for English.
ACE Phase 1 (February 2002) and ACE Pilot
ACE Phase 1 and ACE Pilot included Entities for English.