Language Archive Survey Results


1. Name and Location

Archive Name: LACITO Linguistic Text Archive
Archive URL: http://195.83.92.32
Host Institution: LACITO/CNRS (French National Center for Scientific Research)
Country: France
Contact Person: Boyd Michailovsky
Email Address: boydm@vjf.cnrs.fr


2. Catalog

2.1 If the archive has a catalog in a standardized format, what fields does it contain? If not, what contextual information about the resources is collected? What other information would you like to collect if you could?
Only the language name and a title for each text have been digitized (as XML header metadata), in order to make access by language possible. This is the extent to which the material is electronically catalogued at present. Date and place of recording and speaker name are of course known and will eventually be included in the digitized metadata.
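
As an illustration of how language and title header metadata could support access by language, here is a minimal Python sketch. The element names (TEXT, HEADER, LANGUAGE, TITLE) and the sample values are assumptions made for this example only; they are not the archive's actual schema or contents.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical documents: element names and values are illustrative
# assumptions, not the archive's real XML format.
SAMPLE_DOCS = [
    "<TEXT><HEADER><LANGUAGE>Language A</LANGUAGE><TITLE>Text 1</TITLE></HEADER><BODY/></TEXT>",
    "<TEXT><HEADER><LANGUAGE>Language B</LANGUAGE><TITLE>Text 3</TITLE></HEADER><BODY/></TEXT>",
    "<TEXT><HEADER><LANGUAGE>Language A</LANGUAGE><TITLE>Text 2</TITLE></HEADER><BODY/></TEXT>",
]


def index_by_language(docs):
    """Group text titles by the language named in each document header."""
    index = defaultdict(list)
    for doc in docs:
        header = ET.fromstring(doc).find("HEADER")
        if header is None:
            continue  # skip documents that carry no header metadata
        language = header.findtext("LANGUAGE", default="").strip()
        title = header.findtext("TITLE", default="").strip()
        if language:
            index[language].append(title)
    # Sort titles so the listing is stable regardless of input order.
    return {lang: sorted(titles) for lang, titles in index.items()}


if __name__ == "__main__":
    for lang, titles in sorted(index_by_language(SAMPLE_DOCS).items()):
        print(lang, titles)
```

Fields such as recording date, place, and speaker name could be added to the same header and indexed in the same way once they are digitized.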

2.2 If the electronic catalog conforms to some standard, please tell us the name of the standard.

2.3 To what extent have the archived materials been cataloged electronically?
virtually everything

2.4 If there is an online public access catalog, please give its URL.
http://195.83.92.32


3. Holdings

3.1 What geographical regions and languages are covered?
Main Regions Covered: Asia, Oceania
Approx Number of Languages: 16
Main Languages: 14 languages of New Caledonia, 2 Tibeto-Burman languages of Nepal.

3.2 Please give impressionistic estimates of the archive holdings for each of the data types.
DATA TYPE NON-DIGITAL DIGITAL
Texts: large
Wordlists, Vocabularies, Lexicons, Dictionaries: small
Field Notes, Correspondence, Misc files: none
Descriptions (Grammars, Phonologies, etc): none
Audio Recordings: large
Video Recordings: small

3.3 Please list any other data types which are not included above, or any other comments on the archive holdings:
At present the archive is exclusively made up of time-aligned digitized sound/text documents (narrative, conversation, oral tradition). We are working on making digitized lexicons of the two Nepal languages available and accessible from the texts. LACITO researchers have large amounts of non-digitized or partially digitized material, published and unpublished, which may be included in the archive when digitized and time-aligned.

3.4 What proportion of the holdings are unique to the archive and not available elsewhere?
virtually everything


4. Electronic Publication

4.1 To what extent are the archive holdings published electronically, where "published" means that there is a well-defined procedure such that anyone at all can get a standard copy of the data, either on digital media or over the internet?
a small amount

4.2 To what extent are the archive holdings accessible over the web?
just some samples

4.3 Is permission required before materials can be accessed?
sometimes

4.4 Is there any fee for materials?
no

4.5 How are author and/or editor defined for the electronic publications? Is there a bibliographical citation method?
We have not yet addressed these issues.

4.6 Do the electronic publications have ISBN numbers?
no

4.7 What plans are there to expand the electronic publication of archive holdings?
We are working on expanding and developing the existing archive, and making more of it public. We are not clear on the issues involved in "electronic publication" and are interested in others' solutions. Clearly, researchers would be encouraged to prepare materials for the archive if a standard for electronic publication of such documents were defined and recognized.


5. General Issues

5.1 Who is the legal owner of archived materials?
We have not studied this issue.

5.2 Beyond legal ownership, are there any asserted or perceived moral rights concerning archived materials? Do the holders of the archive see the original speakers or their representatives as controlling publication?

5.3 In cases where no electronic publication is planned, why is this so? (e.g. funding, licensing, technical know-how, lack of interest).

5.4 Is any of the data in a proprietary format (e.g. MS Word)? If so, are there plans to transfer it to an open standard (e.g., XML)?
All material is in XML.


6. Do you have any other comments about digital archives of language material, or on this survey?


