US funding agencies such as the National Science Foundation (NSF) increasingly require researchers to deposit data in an accessible, trustworthy archive. The Consortium’s expertise in data curation, distribution and management and its commitment to the broad accessibility of linguistic data make it the repository of choice for NSF-funded data.

LDC administers data management plans by providing archiving services and making data publicly available at a reasonable cost while protecting intellectual property rights and accommodating privacy concerns. In addition, LDC has infrastructure and processes in place for reviewing, storing and distributing resources over the long term, a key element of data management plans in general.

Data sets developed and/or distributed with NSF funding include Arabic Broadcast News Speech and Transcripts, Grassfields Bantu Fieldwork, Penn Discourse Treebank, Propbank, SLX Corpus of Classic Sociolinguistic Interviews, Subglottal Resonances Database, The Santa Barbara Corpus of Spoken American English (multiple parts), Translanguage English Database and Speech in Noisy Environments (SPINE) (multiple releases).

Learn more about how LDC can assist researchers in developing and implementing data management plans on our website or in our data sheet, or contact LDC Data Management Plans.