This page describes archives that house materials intended to document and describe human languages, such as wordlists, lexicons, annotated signals, interlinear texts, paradigms, field notes, and linguistic descriptions. Please send updates to Steven Bird (sb@ldc.upenn.edu).
By language documentation and language description, we have Himmelmann's definitions in mind:
The aim of a language documentation is to provide a comprehensive record of the linguistic practices characteristic of a given speech community. Linguistic practices and traditions are manifest in two ways: (1) the observable linguistic behavior, manifest in everyday interaction between members of the speech community, and (2) the native speakers' metalinguistic knowledge, manifest in their ability to provide interpretations and systematizations for linguistic units and events. This definition of the aim of a language documentation differs fundamentally from the aim of language descriptions: a language description aims at the record of A LANGUAGE, with "language" being understood as a system of abstract elements, constructions, and rules that constitute the invariant underlying structure of the utterances observable in a speech community. (page 166, 'Documentary and descriptive linguistics', Nikolaus P. Himmelmann (1998). Linguistics 36. pp. 161-195. Berlin: de Gruyter.)
Related index pages: Linguistic Exploration, Linguistic Annotation
Not listed? Details incorrect? Please complete our Survey of online language archives.
| Linguistic Resources | |||||
|---|---|---|---|---|---|
| F |
Archive of Folk Culture
(folklife@loc.gov)
The Archive of Folk Culture is part of the Library of Congress, and houses over a million photographs, manuscripts, audio recordings, and moving images. It has many field recordings for American Indian languages. | ||||
| U |
Archive of Indigenous Languages of Latin America
(Joel Sherzer, Anthony Woodbury)
The AILLA project is creating an audio and text database, equipped with a web interface, to archive and disseminate materials from the indigenous languages of Latin America. The archive will store materials drawn from the full range of linguistic behavior - from phonetics to discourse, in the form of primary data and analyses - making accessible a very broad range of data on linguistic behavior. [ Lev Michael's Chicago talk ] | ||||
| P |
ALMA: African Language Material Archive
(Leigh Swigart)
The West African Research Association (WARA) is planning a new repository of materials published in the last few decades in West African languages. These materials will probably include basic education materials, newspapers, poetry and literature, development materials (e.g., agricultural and health) as well as basic grammars and lexicons produced by West African linguists. WARA is collaborating with Michigan State University to digitize a large number of research materials on Africa, including sound data. A report about the planned archive is posted here. | ||||
| P |
ANLC: Alaska Native Language Center Archives
(Gary Holton)
The Alaska Native Language Center houses an archive of more than 10,000 items, covering virtually everything written in or about Alaska Native languages, including copies of most of the earliest linguistic documentation. The archive also contains significant collections about related languages outside Alaska. Much of the material consists of original field notes and unpublished manuscripts. An online index is under construction, and a proposal to digitize the entire archive is being prepared. Online resources for Iñupiaq and Yup'ik are available here. [survey response] | ||||
|
APS: American Philosophical Society American Indian Manuscript Collections
(Robert Cox)
The American Philosophical Society houses an extensive collection of historical materials (manuscripts, microfilms, audio) documenting some 200 American Indian languages. The archive includes wordlists and texts collected by Bloomfield, Sapir and Boas. [online bibliography, survey response] | |||||
| F |
ASEDA: Aboriginal Studies Electronic Data Archive
(Patrick McConvell)
ASEDA has texts, dictionaries, grammars and teaching materials representing some 300 Australian Aboriginal languages. | ||||
|
Archives of Traditional Music
(atmusic@indiana.edu)
This is a large ethnographic sound archive based at Indiana University, and holds field recordings of music, folktales, interviews and oral histories. Selected recordings are available online. | |||||
| U |
CDEL: Center for the Documentation of Endangered Languages
(Douglas Parks, Wally Hooper)
CDEL archives sound recordings and restores historic recordings. Two of their dictionaries have online samples [see exploration:CDEL] | ||||
| P |
CoPAR: Council for the Preservation of Anthropological Records
(Don Fowler, Nancy Parezo)
CoPAR is based at Arizona State University, and its mission is to "identify, encourage the preservation, and foster the use of the records of anthropological research." CoPAR organizes regular workshops on preservation and access. CoPAR plans a metadata index, called the National Guide to Anthropological Records, to record the locations of anthropological collections. The entries in this index will use CoPAR's Content Standard for Anthropological Metadata. | ||||
| F |
Creolist Archives
(Mikael Parkvall)
The Creolist Archives has text and speech collections which document a wide variety of creoles. The site also has sections on research material and online scanned versions of old books. | ||||
| P |
DOBES Archive
(Peter Wittenburg)
The Max Planck Institute in Nijmegen is developing a multimedia databank called DOBES, funded by the pilot phase of the VW Foundation's project on Documentation of Endangered Languages. [survey response] | ||||
| P |
E-MELD: Electronic Metastructure for Endangered Language Data
(Helen Aristar Dry)
LINGUIST is planning a comprehensive digital metadata archive which will serve as a global index for endangered language (EL) documentation and description. | ||||
| U |
LACITO Linguistic Data Archive
(Boyd Michailovsky)
The LACITO Archive contains 100+ texts in 16 languages (14 from New Caledonia and 2 from Nepal). A few of the texts are available online, along with interlinear transcriptions and aligned audio. An interesting feature of the archive is its use of XML [see exploration:LACITO]. [survey response] | ||||
| P |
LDC Language Documentation Archive
(Steven Bird)
The Linguistic Data Consortium is planning an archive and publications series (CD-ROM and Web) for lexicons, (interlinear) texts, and annotated audio recordings. This work is being pursued in the context of the HyperLex project. | ||||
|
LPCA: Language and Popular Culture in Africa Text Archives
(Vincent A. De Rooij)
Language and Popular Culture in Africa is an internet-based project which aims to document the expressions of popular language and culture in Africa. Most of the materials are texts and dialogues in Swahili. [survey response] | |||||
| U |
MPI Language Archive
(Peter Wittenburg)
The language archive of the Max Planck Institute, Nijmegen holds the texts, lexicons and audio and video recordings collected by its members. A private digital archive is under construction. [survey response] | ||||
|
NAA: National Anthropological Archives
(Robert Leopold)
The NAA collection at the Smithsonian Institution contains historical and contemporary materials that document the world's cultures, including manuscripts, fieldnotes, correspondence, photographs, maps, sound recordings, film and video (online catalog). The collection includes over 1,300 sound recordings of Native American myths, legends, stories and songs recorded by John Peabody Harrington and his associates for the Bureau of American Ethnology between 1912 and 1941. The NAA has all of Harrington's aluminum disk recordings, plus newly mastered reel-to-reel audiotapes from which duplicate cassettes are produced on demand. Robert Leopold maintains an excellent page of links to ethnographic archives and resources. [survey response] | |||||
|
OTA: Oxford Text Archive
(Martin Wynne)
The Oxford Text Archive contains thousands of texts, and also lexicons and notes, for some 25 languages [online catalog, survey response] | |||||
| F |
SAA: Speech Accent Archive
(Steven Weinberger)
This archive contains a short text, read by over 100 non-native speakers of English. Audio and IPA transcriptions are provided. [survey response] | ||||
|
SBALD: Santa Barbara Archive of Language and Discourse
(Jack Du Bois)
The Linguistics Department at UCSB has an archive of field recordings made by its members over a period of 30 years. The archive contains about 400 120-minute DAT tapes copied from the original analog recordings (mostly reel-to-reel). Holdings are strongest for North American indigenous languages, but there are many recordings from Central American and Asian languages, as well as other parts of the world. The recordings are cataloged and indexed in a relational database. | |||||
| P |
SIL-LCA: SIL Language and Culture Archive
(Joan Spanne)
The Summer Institute of Linguistics has a paper and microfiche archive of primary materials, and is planning an electronic archive. [survey response] | ||||
| F |
SIL Mexico Archive
(Albert Bickford)
The Mexico Branch of the Summer Institute of Linguistics has a website with descriptive works covering some 20 languages of Mexico (Seri, plus several varieties of Mixtec, Zapotec, and Nahuatl), including a Series of Vocabularies and Dictionaries. [survey response] | ||||
| P |
Survey of California and Other Indian Languages
(Leanne Hinton)
This archive, based at the UCB Linguistics Department, is a repository for linguistic fieldnotes and descriptions of Native American languages (online catalog). The office manages a major collection of tapes of American Indian languages (online catalog). | ||||
| F |
UHLCS: University of Helsinki Language Corpus Server
(Marko Pölönen)
This archive, hosted by the Department of General Linguistics at the University of Helsinki, contains texts and wordlists for 60 Uralic, Turkic and Iranian languages, plus Swedish, English, German and Russian. [survey response] | ||||
Others being investigated: http://pacling.anu.edu.au/ http://www.ewc.hawaii.edu/ http://www.soas.ac.uk/Archives/db.html http://pandora.nla.gov.au/pandora/ http://home.t-online.de/home/LINCOM.EUROPA/#Ling http://www.indiana.edu/~libarchm/ http://www.ucs.mun.ca/~culture/munfla.html http://cgi.portugues.mct.pt/acesso/ http://cdli.ucla.edu/ http://www.upenn.edu/museum/Collections/archives.html
Last update: 9 April 2002