Current Projects

LDC is involved in a number of projects that support language education, research and technology development.

AIDA (Active Interpretation of Disparate Alternatives) (DARPA)

AIDA’s goal is to develop a multi-hypothesis semantic engine that generates explicit alternative interpretations of events, situations and trends from a variety of unstructured sources. LDC supports AIDA by collecting, creating and annotating multimodal linguistic resources in multiple languages.

COVID-19 Research

Data developed by LDC in the DARPA LORELEI program is available under a special no-cost license for COVID-19 research.

KAIROS (Knowledge-directed Artificial Intelligence Reasoning Over Schemas) (DARPA)

KAIROS seeks to develop a schema-based AI system that can identify complex events and bring them to the attention of users. It aims to understand complex events described in multimedia inputs by developing a semi-automated system that identifies, links, and temporally sequences their subsidiary elements, the participants involved, and the complex event type. LDC supports KAIROS by collecting, creating and annotating linguistic resources in multiple languages.

LRE (Language Recognition Evaluation) (NIST)

LDC develops linguistic resources to support the NIST Language Recognition Evaluation (LRE) series.

NIEUW (Novel Incentives and Workflows in Linguistic Data Collection and Annotation) (NSF)

NIEUW is an LDC project supported by an NSF CISE Research Infrastructure planning grant. The goal is to build a framework for developing multilingual language resources using crowdsourcing techniques proven to work in multiple scientific disciplines.

OpenMT (Machine Translation) (NIST)

LDC supports the NIST Open Machine Translation (OpenMT) Evaluation series by developing test sets in multiple languages and genres and by sharing linguistic resources developed in other programs, including DARPA GALE and TIDES. The objective of the OpenMT evaluation series is to support research in machine translation (MT) technologies, which translate text between human languages, and to advance the state of the art in the MT field. Input may include all forms of text. The goal is for the output to be an adequate and fluent translation of the original.

SRE (Speaker Recognition Evaluation) (NIST)

LDC develops linguistic resources to support the NIST Speaker Recognition Evaluation (SRE) series.