New Corpora

Russian telephone speech, transcripts and lexicon: CALLFRIEND Russian Speech and CALLFRIEND Russian Text: 48 hours of telephone conversations between native speakers, with corresponding transcripts and a lexicon, developed by LDC for automatic language identification

English public safety communications problem solving activity: 2019 OpenSAT Public Safety Communications Simulation: 141 hours of speech with transcripts; speakers played the Flash Point Rescue board game with noisy headsets; each session consisted of two 30-minute games; data divided into training, dev and eval

Icelandic prompted speech: Samrómur Queries Icelandic Speech 1.0: 20 hours of Icelandic prompted queries from 3,809 speakers representing 17,475 utterances, developed by the Language and Voice Lab, Reykjavik University in cooperation with Almannarómur, Center for Language Technology

Spanish telephone, read and elicited speech for speaker ID: Mixer 7 Spanish Speech: 9,600 hours of recordings, 191 native Spanish speakers, some data collected using a 14-microphone array, used in the 2012 NIST SRE test set

Thai language resources for HLT development: LORELEI Thai Representative Language Pack: monolingual and parallel text with entity linking and detection annotation and situation frame analysis, developed by LDC for the DARPA LORELEI program