Language Archive Survey Results


1. Name and Location

Archive Name: The Rosetta Project
Archive URL: http://www.rosettaproject.org
Host Institution: Long Now Foundation
Country: USA
Contact Person: Jim Mason
Email Address: jimmason@longnow.org


2. Catalog

2.1 If the archive has a catalog in a standardized format, what fields does it contain? If not, what contextual information about the resources are collected? What other information would you like to collect if you could?
The source (in full bibliographic format) will be collected soon; for the moment, only title and author are recorded, along with, of course, the relevant language.

2.2 If the electronic catalog conforms to some standard, please tell us the name of the standard.
14th ed. Ethnologue language names/codes

2.3 To what extent have the archived materials been cataloged electronically?
virtually everything

2.4 If there is an online public access catalog, please give its URL.
http://www.rosettaproject.org


3. Holdings

3.1 What geographical regions and languages are covered?
Main Regions Covered: Africa, Americas, Asia, Europe, Oceania
Approx Number of Languages: 1200
Main Languages: Too many to name (Degema was the first entry to be completed)

3.2 Please give impressionistic estimates of the archive holdings for each of the data types.
DATA TYPE                                          NON-DIGITAL   DIGITAL
Texts:                                             small         large
Wordlists, Vocabularies, Lexicons, Dictionaries:   small         large
Field Notes, Correspondence, Misc files:           none          small
Descriptions (Grammars, Phonologies, etc):         small         large
Audio Recordings:                                  small         none
Video Recordings:                                  none          none

3.3 Please list any other data types which are not included above, or any other comments on the archive holdings:
Precise, up-to-date figures can be obtained from the Advanced Search page on the site.

3.4 What proportion of the holdings are unique to the archive and not available elsewhere?
a small amount


4. Electronic Publication

4.1 To what extent are the archive holdings published electronically, where "published" means that there is a well-defined procedure such that anyone at all can get a standard copy of the data, either on digital media or over the internet?
virtually everything

4.2 To what extent are the archive holdings accessible over the web?
virtually everything

4.3 Is permission required before materials can be accessed?
no

4.4 Is there any fee for materials?
no

4.5 How are author and/or editor defined for the electronic publications? Is there a bibliographical citation method?
There will be a full bibliographical citation form very soon; at the moment, only author and publication are recorded.

4.6 Do the electronic publications have ISBN numbers?
no

4.7 What plans are there to expand the electronic publication of archive holdings?
Everything in the archive will be made available online, copyright issues permitting; indeed, most of the archive will only be gathered together in that form.


5. General Issues

5.1 Who is the legal owner of archived materials?
The Long Now Foundation for original contributions, unless otherwise specified; the rest is excerpted under Fair Use provisions. All of it is and will remain freely available for non-commercial use.

5.2 Beyond legal ownership, are there any asserted or perceived moral rights concerning archived materials? Do the holders of the archive see the original speakers or their representatives as controlling publication?
The question has yet to arise, since we hold no archives of pre-Internet recordings by native speakers.

5.3 In cases where no electronic publication is planned, why is this so? (e.g. funding, licensing, technical know-how, lack of interest).
Copyright issues.

5.4 Is any of the data in a proprietary format (e.g. MS Word)? If so, are there plans to transfer it to an open standard (e.g., XML)?
Scanned materials are in BMP and will soon be converted, where possible, to the proprietary PDF format; XML is a strong future possibility, but was ruled out for the present because few browsers are capable of handling it.


6. Do you have any other comments about digital archives of language material, or on this survey?


