Common-task human language technology (HLT) evaluations address a particular research problem. Participants develop and test candidate models according to an evaluation plan. Training, development, and test data are ingested at various points in the process: training data is used to train models, development data (also known as a cross-validation set) is used to select the best-performing model(s), and test data assesses a model's performance. Outcomes are analyzed and scored.

Training, development, and test data can be subsets of a single data set, though development and test data may also originate from sources other than the training set.
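The three-way partition described above can be sketched as follows. This is a minimal illustration, not an LDC procedure; the utterance IDs and the 80/10/10 split ratios are hypothetical:

```python
import random

def split_dataset(examples, dev_frac=0.1, test_frac=0.1, seed=0):
    """Partition a list of examples into train/dev/test subsets.

    Shuffling with a fixed seed keeps the split reproducible,
    which matters when results must be replicated later.
    """
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_test = int(n * test_frac)
    n_dev = int(n * dev_frac)
    test = shuffled[:n_test]
    dev = shuffled[n_test:n_test + n_dev]
    train = shuffled[n_test + n_dev:]
    return train, dev, test

# Hypothetical example: 100 utterance IDs split 80/10/10
data = [f"utt_{i:03d}" for i in range(100)]
train, dev, test = split_dataset(data)
```

In practice, evaluation corpora are usually distributed with the partition already fixed so that all participants train, tune, and test on exactly the same subsets.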

For many HLT evaluation tasks, LDC partners with NIST’s Multimodal Information Group and Retrieval Group to provide training, development and test data for research areas that include speech recognition, language recognition, machine translation, cross-language retrieval and multimedia retrieval.

In collaboration with evaluation sponsors, the Consortium releases evaluation corpora through the Catalog. Many are turnkey packages that allow users to replicate the evaluation, consisting of evaluation plans and specifications; training, development, and test data; scoring software; and answer keys.

Visit the Technology Evaluation pages for more information about evaluation data sets available from LDC.