PARTICIPANTS


ORGANIZERS

Steven Bird, Philadelphia, Pennsylvania, USA
University of Pennsylvania
Email: sb@ldc.upenn.edu
Web: http://www.ldc.upenn.edu/sb
Data Types: Metadata, Word list, Lexicon, Annotated signal, Writing system, Interlinear text, Paradigm, Field notes, Description, Common
Functions: Store, Create, Convert, Display, Query
For the last 15 years I have been developing data models and software tools in support of linguistic research. From 1995-97 I worked in Cameroon, documenting and analyzing the tone systems of the Grassfields Bantu languages, and developing computational methods to support the work. An online example is available at http://www.ldc.upenn.edu/sb/fieldwork/. I am associate director of the Linguistic Data Consortium, and I am a principal investigator on two NSF projects (Linguistic Exploration, TalkBank) providing computational infrastructure for empirical work in linguistics and in the social sciences more generally.

Gary Simons, Dallas, Texas, USA
SIL International
Email: gary_simons@sil.org
Web: http://www.sil.org/SIL/roster/simons.htm
Data Types: Metadata, Word list, Lexicon, Annotated signal, Writing system, Interlinear text, Paradigm, Field notes, Description, Common
Functions: Store, Create, Convert, Display, Query
For fifteen years (1984-1999) as Director of Academic Computing for SIL International I was involved in directing a number of projects that developed software to assist field linguists in documenting and describing language (IT, CELLAR, LinguaLinks, FieldWorks). In my current position as Associate VP for Academic Affairs I'm still involved in oversight of this area as well as of our efforts to launch an on-line language archive. During the development of the Text Encoding Initiative's guidelines for text markup, I was involved as a member of the Committee on Text Analysis and Interpretation and of the Technical Review Committee.


PRESENTERS AND PANELISTS

Helen Aguera, Washington, DC, USA
National Endowment for the Humanities
Email: HAguera@neh.gov
Web: http://www.neh.gov
Data Types: Metadata, Word list, Lexicon, Interlinear text, Field notes, Description
Functions: Store, Create, Convert, Display, Query
I am a program officer in the NEH Division of Preservation and Access. The Division's programs support various types of projects related to language documentation and description. Awards fund the following activities: preparation of dictionaries, grammars, and corpora; reformatting of archival materials in order to preserve them; establishment of intellectual access to materials in archives and special collections; research and demonstration projects to develop standards and best practices and to enhance the use of digital technology to provide access to humanities resources.

Eric Albright, Duncanville, Texas, USA
SIL International
Email: eric_albright@sil.org
Data Types: Writing system
Functions: Store
I am currently involved in thesis research pertaining to the description of writing systems. I have been involved in structured markup (SGML, XML) technology for about 5 years.

Jonathan Amith, Dallas, Oregon, USA
Yale University
Email: jonathan.amith@yale.edu
Data Types: Lexicon, Annotated signal, Interlinear text, Paradigm, Field notes, Description
Functions: Store, Display, Query
The elaboration of a trilingual lexicon of Nahuatl (to Spanish and English) including detailed fields (among others) on grammatic function, morphology, semantic field, inflectional patterns, and roots. The writing of a pedagogically oriented reference grammar for modern Nahuatl, the electronic version of which will be linked to the lexicon. Creation of learning exercises and texts for use in the classroom and that will be made interactive and placed online. Elaboration of a corpus of narrative material including songs, prayers, life histories, conversation, etc. An interest in having the ability to link lexicon, grammar, and corpus online to facilitate rapid movement from one "section" to another.

Anthony Aristar, Ann Arbor, Michigan, USA
Eastern Michigan University
Email: aristar@linguistlist.org
Data Types: Metadata
Functions: Store
I am a co-moderator of The LINGUIST List, which is configuring a database to store linguistics-related metadata. The database will also store a limited amount of endangered languages data, but the focus of our activities is collecting and providing metadata to the discipline.

Helen Aristar-Dry, Ann Arbor, Michigan, USA
Eastern Michigan University
Email: hdry@linguistlist.org
Data Types: Metadata
Functions: Store
I am a co-moderator of The LINGUIST List, which is configuring a database to store linguistics-related metadata. The database will also store a limited amount of endangered languages data, but the focus of our activities is collecting and providing metadata to the discipline.

Neal Audenaert, College Station, Texas, USA
Texas A&M University
Email: neala@tamu.edu
Data Types: Lexicon, Common
Functions: Store, Create, Query
I am currently the primary investigator for a software development project attempting to create a system to support archiving and analyzing linguistic data. This system, the Language Data Repository, will support distributed access to linguistic data over a network, be that a local intranet, or the Internet, and host third party tools to support data analysis.

John Bell, Philadelphia, Pennsylvania, USA
University of Pennsylvania
Email: jmbell@babel.ling.upenn.edu
Data Types: Lexicon
Functions: Store
We have made a study involving approximately 55 dictionaries, investigating what sort of formats occur in them. From this we have begun to develop a general model of lexical entries, as well as trying to find what elements are universal in lexical entries in dictionaries. We have created a number of sample entries in XML format.

Daan Broeder, Nijmegen, The Netherlands
Max Planck Institute for Psycholinguistics
Email: Daan.Broeder@mpi.nl
Data Types: Metadata, Word list, Lexicon, Annotated signal, Interlinear text, Field notes, Description
Functions: Store, Create, Convert, Display, Query
Designing schema and implementing tools with respect to the description of language resources for several projects, especially within the ISLE/EAGLES project. Influential participation in the TIDEL development team for the DOBES project and for anthropological and linguistic multi-media resources at the MPI.

Jean Carletta, Edinburgh, Scotland
Language Technology Group and Human Communication Research Centre, University of Edinburgh
Email: jeanc@cogsci.ed.ac.uk
Web: http://www.cogsci.ed.ac.uk/~jeanc
Data Types: Annotated signal, Interlinear text, Paradigm, Field notes, Description
Functions: Store, Create, Convert, Display, Query
The Language Technology Group has been developing technologies which support language corpora annotated at multiple, non-hierarchically structured levels, using XML/XSL with stand-off annotation. We have particular interests in technologies for hand and automatic data annotation and are currently beginning to explore ways of improving support for working with data by making it possible to configure coding, analysis, and display interfaces using graphical user interfaces.

Peter Constable, Dallas, Texas, USA
SIL International
Email: peter_constable@sil.org
Web: http://www.sil.org
Data Types: Metadata, Writing system, Linguistic description
Functions: Store, Create, Convert, Display
For the past few years, I have been working as part of a team within SIL conducting research to develop solutions for working with non-Roman writing systems on computers. Overall, this effort has been looking at issues of character encoding, encoding conversion, keyboard input, complex-script rendering, and fonts. In terms of software platforms, the focus has been on solutions for Microsoft Windows, but also for the Mac OS. Areas of particular focus for me have included the Unicode standard and, more recently, language identifiers.

Sharon Correll, Dallas, Texas, USA
SIL International
Email: Sharon_Correll@sil.org
Data Types: Word list, Lexicon, Writing system, Interlinear text, Field notes, Description, Common
Functions: Display
I'm involved in developing a text-rendering system called Graphite that can be used for complex scripts. Graphite is programmable, so it can be extended to handle varieties of writing system that are not handled by system software (such as Uniscribe).

Megan Crowhurst, Austin, Texas, USA
University of Texas at Austin
Email: mcrowhurst@mail.utexas.edu
Web: http://uts.cc.utexas.edu/~crowhurs/index.html
Data Types: Word list, Interlinear text, Field notes, Description
Functions: Store, Display, Query
I have been engaged in field research on Tupi-Guarani languages spoken in Bolivia, especially Guarayu (a language spoken by approximately 7000-8000 people in eastern Bolivia) since 1996. At present, I am serving as the Chair of the Linguistic Society of America's Committee on Endangered Languages and their Preservation.

Dafydd Gibbon, Bielefeld, Germany
Universität Bielefeld
Email: gibbon@spectrum.uni-bielefeld.de
Web: http://coral.lili.uni-bielefeld.de/~gibbon/
Data Types: Metadata, Word list, Lexicon, Annotated signal, Paradigm, Field notes, Description
Functions: Store, Create, Convert, Display, Query
My special interests with regard to language description are varied, but focus on the modelling of prosody, particularly in West African tone languages, and on computational lexicography for spoken language, with applications both to language documentation and speech technology. In the language documentation domain, I am just finishing a 4-year cooperation project with Université de Cocody, Abidjan, Côte d'Ivoire (1997-2000), on designing an encyclopedia for Ivorian languages, and have just started a 1-year pilot project "Ega: a documentation model for an endangered Ivorian language" within the DOBES consortium (Dokumentation bedrohter Sprachen - Documentation of endangered languages) funded by the Volkswagen Foundation, also in cooperation with the Université de Cocody, in which my project partners are Dr Firmin Ahoua, Cocody, and Dr Bruce Connell, Oxford. Currently I am also working with Dr Eno-Abasi Urua of the University of Uyo, Nigeria, on the prosody of languages of the Cross River region of Nigeria.

Jeff Good, Berkeley, California, USA
University of California, Berkeley/CBOLD
Email: jcgood@socrates.berkeley.edu
Data Types: Metadata, Word list, Lexicon, Common
Functions: Store, Create, Convert, Display, Query
Working at the Comparative Bantu Online Dictionary (CBOLD), I am involved in preparing data sources for online use and in working out the problems of how to turn a set of dictionaries and word lists into a comparative database--i.e., linking entries to historical reconstructions and linking together cognates across our sources.

Susan Hockey, London, UK
School of Library, Archive and Information Studies, UCL
Email: s.hockey@ucl.ac.uk
Web: http://www.ucl.ac.uk/slais/staff/shockey.html
Data Types: Metadata, Word list, Writing system, Description, Common
Functions: Store, Create, Display, Query
In a career in humanities computing that is now longer than I dare to admit, I have been involved in text analysis software development (COCOA and OCP), tools for the display of non-standard characters (Greek, Arabic, Hebrew), the Text Encoding Initiative (as a Member of the Steering Committee and of the Text Representation Committee), metadata research (cataloging electronic texts at CETH (Rutgers and Princeton), and the development of the Dublin Core), and electronic publishing of a text base with SGML encoding of literary interpretation (the Orlando Project at the Universities of Alberta and Guelph). I am now a Professor of Library and Information Studies teaching and researching in humanities information management. I am interested in exploring how tools and techniques from humanities computing can be brought together with established techniques and standards in library and archive studies to develop Internet-based electronic resources to serve the needs of humanities scholarship and teaching.

Gary Holton, Fairbanks, Alaska, USA
Alaska Native Language Center
Email: gary.holton@uaf.edu
Web: http://www.uaf.edu/anlc
Data Types: Metadata, Word list, Lexicon, Field notes
Functions: Store, Create, Display
I am a descriptive linguist with field work experience in Alaska and eastern Indonesia. I am interested in using the internet to make archival documentation materials on Alaska Native languages available in digital form and to permit input of data by other field workers.

Nancy Ide, Poughkeepsie, New York, USA
Vassar College
Email: ide@cs.vassar.edu
Web: http://www.cs.vassar.edu/~ide
Data Types: Metadata, Lexicon
Functions: Store, Create, Convert, Query
Development of EAGLES standard encoding and annotation formats for linguistic corpora in XML (XCES), in particular for morpho-syntactic encoding, parallel alignment, computational lexicons, and syntactic annotation. Development of a standard general annotation formalism and framework to support an EAGLES annotation repository.

Mark Liberman, Philadelphia, Pennsylvania, USA
University of Pennsylvania
Email: myl@cis.upenn.edu
Web: http://www.ling.upenn.edu/~myl
Data Types: Metadata, Lexicon, Annotated signal, Interlinear text, Paradigm, Field notes
Functions: Store, Create, Convert, Display, Query
See http://www.ling.upenn.edu/~myl for description of activities.

Saturnino Luz, Odense, Denmark
University of Southern Denmark
Email: luzs@acm.org
Web: http://tec.ccl.umist.ac.uk/
Data Types: Metadata
Functions: Store
I've collaborated in the TEC (Translational English Corpus) defining an architecture for access to the corpus over the internet and implementing most of the client-server software.

Kazuaki Maeda, Philadelphia, Pennsylvania, USA
LDC, University of Pennsylvania
Email: maeda@ldc.upenn.edu
Data Types: Metadata, Word list, Annotated signal, Interlinear text
Functions: Store, Create, Convert, Display
I am a programmer working with Steven Bird at LDC. I am also a graduate student at Penn working in the areas of speech technologies and phonetics research.

Anne Mahoney, Medford, Massachusetts, USA
Perseus Project, Tufts University
Email: amahoney@perseus.tufts.edu
Web: http://www.perseus.tufts.edu
Data Types: Metadata, Lexicon, Paradigm
Functions: Store, Create, Convert, Display, Query
Our digital library includes lexica, grammars, and morphological analyzers for the languages we deal with. We work on automatic information extraction (automated markup), and are starting to work on cross-language information retrieval.

Elena Maslova, San Francisco, California, USA
University of Bielefeld
Email: Maslova@jps.net
Data Types: Metadata, Lexicon, Annotated signal, Interlinear text, Description
Functions: Store, Convert, Display
A system for linguistic analysis of text data (Paradox, ObjectPal), intended primarily for agglutinative languages (in St. Petersburg Institute of Linguistic Research, Paleo-Siberian Department, in cooperation with Eugene Levin). A general "Language Description System", intended as a framework for computer-based descriptive grammars (HyperCard, a project in University of Bielefeld, led by Christian Lehmann). A system for analysis and transcription of acoustic records (Paradox, Delphi, in cooperation with Eugene Levin). A system for analysis, representation and indexation of texts (Java, University of Leiden). A descriptive grammar of Yukaghir.

Mike Maxwell, Waxhaw, North Carolina, USA
SIL
Email: Mike_Maxwell@sil.org
Data Types: Word list, Lexicon, Annotated signal, Writing system, Interlinear text, Paradigm, Description
Functions: Create, Display, Query
I work in the areas of computational morphology and phonology. With respect to the conference, my interests are in methods for rapid documentation of phonological, grammatical and lexical analyses of languages, and in making those analyses and the intermediate data used to create them permanently available in searchable electronic form.

Patrick McConvell, Canberra, Australia
Australian Institute of Aboriginal and Torres Strait Islander Studies
Email: patrick@aiatsis.gov.au
Web: http://www.aiatsis.gov.au
Data Types: Word list, Lexicon, Interlinear text
Functions: Store, Convert, Query
I have been appointed to the position of Research Fellow, Language and Society at AIATSIS, Canberra, this year. I have a background in anthropology and linguistics and have worked on description of Australian indigenous languages, and in bilingual education, language maintenance intervention and in training of indigenous language workers. My own current research is mainly related to language shift and maintenance, and language, culture and history, but I also deal with issues of documentation on behalf of the research section of AIATSIS, especially the needs of indigenous clients and their organisations. AIATSIS, in addition to holding a large print and audio-visual collection, has developed a digital archive, ASEDA, mainly devoted to Australian indigenous language documentation.

Anthony McEnery, Lancaster, UK
Lancaster University
Email: mcenery@comp.lancs.ac.uk
Web: http://www.ling.lancs.ac.uk/staff/tony/tony.htm
Data Types: Writing system
Functions: Convert
Largely concerned with harmonising multiple representations of South Asian writing systems into Unicode. Especially interested in doing that within the context of the General Architecture for Text Engineering (GATE).

Lev Michael, Austin, Texas, USA
University of Texas at Austin
Email: lmichael@mail.utexas.edu
Web: http://uts.cc.utexas.edu/~ailla
Data Types: Metadata, Interlinear text, Field notes, Linguistic description
Functions: Create, Display, Query
As a member of the team that is developing the Archive of the Indigenous Languages of Latin America (AILLA), I am involved in ethnographic and discourse typological research, and the implementation of the resulting taxonomy in the archive's metadata structures, and in user-friendly search interfaces. I am also conducting ongoing research on Nanti, an Arawakan language with roughly 500 speakers in southeastern Peru. I am presently concentrating on basic descriptions of Nanti phonology, prosody, and morphology, and on documentation of everyday discourse.

Boyd Michailovsky, Villejuif, France
LACITO, CNRS, France
Email: Boyd.Michailovsky@vjf.cnrs.fr
Data Types: Metadata, Lexicon, Annotated signal, Writing system, Interlinear text, Paradigm, Description
Functions: Store, Create, Convert, Display, Query
I am a general linguist working mainly on Tibeto-Burman languages of the Himalayan region and using structured text data formats for texts, lexicons, and comparative phonological data. Recently I have participated in the design of a research-oriented archive of time-aligned speech on the web (http://lacito.vjf.cnrs.fr/archivage.htm); about 90 minutes of texts in Hayu and Limbu (Tibeto-Burman languages of Nepal) are currently available for browsing and limited query.

Michael Nelson, Hampton, Virginia, USA
NASA LaRC & University of North Carolina
Email: mln@ils.unc.edu
Web: http://www.ils.unc.edu/~mln/
Data Types: Metadata
Functions: Store, Create, Display
Digital libraries: Open Archives Initiative (OAI) and the Smart Object, Dumb Archive (SODA) model.

Nicholas Ostler, Bath, England
Foundation for Endangered Languages
Email: nostler@chibcha.demon.co.uk
Web: http://www.ogmios.org, http://www.chibcha.demon.co.uk
Data Types: Lexicon, Interlinear text, Description, Common
Functions: Create, Query
As President of FEL and editor of its newsletter Ogmios, I encourage submission of proposals for documentation work relevant to endangered languages, and give coverage to work that is current. I am also currently documenting the South American languages Muisca (Chibcha) and Uwa (Tunebo). As principal consultant at Linguacubun, I work on European-, US- & UK-funded projects in language description (especially for machine translation).

Martha Palmer, Philadelphia, PA, USA
University of Pennsylvania
Email: mpalmer@cis.upenn.edu
Web: http://www.cis.upenn.edu/~mpalmer
Data Types: Metadata, Word list, Lexicon
Functions: Store, Create, Convert, Display, Query
American Coordinator for ISLE, International Standards for Language Engineering (joint NSF/EU project). PI for the Penn Chinese Treebank: 100K Chinese words annotated for segmentation, POS tagging, and syntactic bracketing.

Bill Poser
Email: Bill_Poser@telus.net

Joel Sherzer, Austin, Texas, USA
University of Texas
Email: jsherzer@mail.utexas.edu
Data Types: Metadata, Interlinear text
Functions: Create, Display, Query
I have been carrying out research among the Kuna Indians of Panama since 1970. This involves the archiving and documentation of forms of discourse. Recently I have been involved with the AILLA project to archive forms of Latin American indigenous discourse on the web.

Ronald Sprouse, Berkeley, California, USA
UC Berkeley
Email: ronald@uclink.berkeley.edu
Data Types: Metadata, Word list, Lexicon, Interlinear text
Functions: Store, Create, Display, Query
I have worked for several years as Technical Director of the Ingush project at UC Berkeley, in which we are compiling a dictionary, grammar, and set of interlinearized texts. I have developed group collaboration software for collecting and annotating texts, as well as a basic markup standard for these texts. In addition, I have worked with the Comparative Bantu On-Line Dictionary project at UC Berkeley, in which we are attempting to provide a uniform query mechanism to a set of heterogeneous data sources, particularly word lists and lexicons, from a large set of Bantu languages.

Richmond Thomason, Ann Arbor, Michigan, USA
University of Michigan
Email: rich@thomason.org
Web: http://www.eecs.umich.edu/~rthomaso/
Data Types: Lexicon
Functions: Convert, Query
Research interests in Artificial Intelligence, especially in Computational Linguistics. I have assisted in the development of lexical materials for Montana Salish.

Steve Tinney, Philadelphia, Pennsylvania, USA
AMES, University of Pennsylvania
Email: stinney@sas.upenn.edu
Data Types: Metadata, Word list, Lexicon, Writing system, Interlinear text
Functions: Store, Create, Convert, Display, Query
Linguistic, literary and cultural research on Sumerian and Mesopotamia. Co-Director, Pennsylvania Sumerian Dictionary Project. I am heavily involved in computerizing the PSD, developing an XML framework for integration of primary texts, tools like signlists and the lexicon.

David Weber, Westmoreland, NY, USA
SIL International
Email: david_weber@sil.org
Data Types: Lexicon, Writing system, Interlinear text, Field notes
Functions: Store, Create, Convert, Display
I have written a grammar of Huallaga (Huanuco) Quechua, coauthored one on Bora (a Witotoan language spoken in Peru and Colombia), and am currently coauthoring a grammar of Arabela (the last Zaparoan language, spoken by fewer than 100 people in northern Peru). I coauthored a dictionary of Huallaga Quechua (with equivalents, translations and indices in both Spanish and English).

Steven Weinberger, Fairfax, Virginia, USA
George Mason University
Email: weinberg@gmu.edu
Web: http://mason.gmu.edu/~weinberg
Data Types: Metadata, Annotated signal, Linguistic description
Functions: Display, Query
Steven Weinberger is chief investigator and administrator of the speech accent archives (http://classweb.gmu.edu/accent), a repository of non-native English speech and native dialects of English.

Douglas Whalen, New Haven, Connecticut, USA
Endangered Language Fund
Email: whalen@haskins.yale.edu
Web: http://macserver.haskins.yale.edu/haskins/STAFF/whalen.html
Data Types: Metadata, Word list, Interlinear text
Functions: Store, Create, Convert, Display, Query
The Endangered Language Fund is beginning an online collection of material in the Algonquian languages. The first page (on Maliseet) has some initial material (http://www.ling.yale.edu/~elf/maliseet1.html). Our hope is to make searching for reflexes of Algonquian roots easy.

Peter Wittenburg, Nijmegen, The Netherlands
Max Planck Institute for Psycholinguistics
Email: Peter.Wittenburg@mpi.nl
Web: http://www.mpi.nl
Data Types: Metadata, Word list, Lexicon, Annotated signal, Interlinear text, Field notes, Description
Functions: Store, Create, Convert, Display, Query
Head of the development and archiving activities for multimedia language resources at the MPI. Responsible person for the DOBES project for Documenting Endangered Languages. Participation in the EAGLES/ISLE project for defining a proposal for Meta Descriptions for Multimedia Language Resources.


OTHER PARTICIPANTS (Latest Update)

John Alderete, Swarthmore, Pennsylvania, USA
Swarthmore College
Email: jaldere1@swarthmore.edu
Web: http://www.swarthmore.edu/SocSci/jaldere1/ling_jaldere1.html
Data Types: Word list, Lexicon, Annotated signal, Interlinear text, Paradigm, Field notes, Description
Functions: Store, Create, Convert, Display, Query
I work on the northern Athabaskan language Tahltan. I'm doing primary linguistic description of the sound inventory, morpho-phonemics, and the structure of verb words. In collaboration with Tanya Bob and Patricia Shaw at the University of British Columbia, I'm involved in a variety of corpus construction activities, including transcribing large sound files with fieldnotes and texts, and the construction of a lexical database.

Eva Banik, Philadelphia, Pennsylvania, USA
Linguistic Data Consortium
Email: ebanik@babel.ling.upenn.edu
Web: http://www.mentha.hu/~vica
Data Types: Metadata
Functions: Query
Developing data and service providers for language data.

Robert Beard, Lewisburg, Pennsylvania, USA
Bucknell University/yourDictionary.com
Email: rbeard@yourdictionary.com
Web: http://www.yourdictionary.com
Data Types: Word list, Lexicon
Functions: Store, Create, Convert, Display, Query
I have just launched an e-business called "yourDictionary.com" which will soon start publishing grammars and dictionaries on-line. One of our initiatives will be our "Endangered Language Repository" which will offer free webhosting for grammars and dictionaries of endangered languages.

John Albert Bickford, Catalina, Arizona, USA
SIL-Mexico
Email: albert_bickford@sil.org
Data Types: Metadata, Lexicon, Writing system, Interlinear text, Field notes, Description
Functions: Store, Create, Convert, Display, Query
Linguistics editor for SIL-Mexico website and webmaster for NDSIL website, both of which are venues for publication of linguistic materials.

Jack Cain, Toronto, Canada
Multilingual E-Data Solutions (Multedata)
Email: jcain@multedata.ca
Web: http://www.multedata.ca
Data Types: Word list, Lexicon, Writing system
Functions: Store, Create, Convert, Display, Query
Our small consulting firm has been active for two years in assisting the new Canadian Arctic territory of Nunavut in implementing Canadian Aboriginal syllabics. A web-based collaborative but administered Inuktitut "Living Dictionary" is one of the main products we have designed.

Nicoletta Calzolari Zamorani, Ghezzano, Italy
Istituto di Linguistica Computazionale - CNR
Email: glottolo@ilc.pi.cnr.it
Web: http://www.ilc.pi.cnr.it
Data Types: Metadata, Lexicon, Interlinear text, Description
Functions: Store, Create, Convert, Display, Query
European Responsible of the ISLE/EAGLES Computational Lexicon Working Group for standardization of multilingual lexicons. Previous responsibility in standardization of morphosyntax, syntax (subcategorization) and semantics in computational lexicons. Design and creation of large computational lexicons with morphological, syntactic and semantic information. Annotation of text corpora at various levels of linguistic description.

Khalid Choukri, Paris, France
ELRA/ELDA
Email: choukri@elda.fr
Web: http://www.elda.fr
Data Types: Metadata, Word list, Lexicon
Functions: Store, Create, Convert
Managing director of the European Language Resources Distribution Agency (ELDA)

Christopher Cieri, Philadelphia, Pennsylvania, USA
Linguistic Data Consortium
Email: ccieri@ldc.upenn.edu
Web: http://www.ldc.upenn.edu
Data Types: Metadata, Word list, Lexicon, Annotated signal, Writing system, Interlinear text, Paradigm, Field notes, Description, Common
Functions: Store, Create, Convert, Display, Query
Executive Director of the Linguistic Data Consortium, presented at the first Linguistic Exploration workshop on New Methods for Creating, Exploring and Disseminating Linguistic Field Data, continuing work on the description of regional varieties of Italian.

Robert Cox, Philadelphia, Pennsylvania, USA
American Philosophical Society
Email: rscox@amphilsoc.org
Data Types: Metadata, Field notes, Description
Functions: Store, Create, Convert, Display, Query
Practicing archivist with oversight over a large collection of Native American sound recordings and other language data.

Sean Crist, Lansdowne, Pennsylvania, USA
University of Pennsylvania
Email: kurisuto@unagi.cis.upenn.edu
Web: http://www.ling.upenn.edu/~kurisuto
Data Types: Metadata, Word list, Lexicon, Writing system, Interlinear text, Paradigm, Description
Functions: Store, Create, Convert, Display, Query
I'm working on creating online materials on the early Germanic languages (Gothic, Old English, Old Icelandic, and Old High German; see http://www.ling.upenn.edu/~kurisuto/germanic/language_resources.html). My long-term goal is to create a comprehensive comparative database of these languages and to experiment with automated language reconstruction techniques.

Michael Dukes, Stanford, California, USA
Stanford University & University of Canterbury
Email: mdukes@stanford.edu
Data Types: Metadata, Word list, Lexicon, Annotated signal, Writing system, Interlinear text, Paradigm, Field notes, Description, Common
Functions: Store, Create, Convert, Display, Query
My main areas of interest are in Austronesian linguistics, focussing on morphosyntax. I am particularly interested at present in developing flexible corpus and database materials for research and teaching purposes. I have taught linguistic field methods on a regular basis using Filemaker but am interested in learning more about the initiatives under discussion at this workshop with a view to using this technology in the future.

Edward Garrett, Charlottesville, Virginia, USA
University of Virginia
Email: eg3p@virginia.edu
Web: http://faculty.virginia.edu/tibet-initiative/library/resources/adrdp/frameset.html
Data Types: Metadata, Word list, Lexicon, Annotated signal, Writing system, Interlinear text, Linguistic description
Functions: Store, Create, Display, Query
I am working on a project at UVA which documents colloquial Tibetan. We are digitizing videos, transcribing, translating, and developing software tools for language learning and analysis.

David Golumbia, New York City, New York, USA
Independent Scholar
Email: dgolumbi@panix.com
Web: http://www.mindspring.com/~dgolumbi/docs/cv.html
Data Types: Metadata, Lexicon, Annotated signal, Writing system, Interlinear text, Paradigm
Functions: Store, Display, Query
I am a consultant to the East Cree Interactive Grammar project (http://www.eastcree.org), and also a cultural studies scholar and software/web developer, and am very interested in becoming more involved in projects to make indigenous and endangered languages more widely available and represented in new technologies, especially the web.

K. David Harrison, Philadelphia, Pennsylvania, USA
Yale (Endangered Language Fund) and UPenn/IRCS
Email: kdh2@linc.cis.upenn.edu
Web: http://sapir.ling.yale.edu/~ASLEP/ASLEP.htm
Data Types: Metadata, Lexicon, Annotated signal, Writing system, Field notes
Functions: Store, Display
Currently working on an ethnographic and linguistic documentation of several endangered Siberian languages and cultures, within the framework of the DOBES project (funded by Volkswagen-Stiftung).

Benjamin Hary, Atlanta, Georgia, USA
Emory University
Email: bhary@emory.edu
Data Types: Word list, Lexicon, Writing system
Functions: Create, Query
I am the principal investigator of CoSIH, the Corpus of Spoken Israeli Hebrew, with Shlomo Izre'el of Tel Aviv University, who will also attend the workshop.

Johannes Helmbrecht, Erfurt, Germany
University of Erfurt
Email: johannes.helmbrecht@uni-erfurt.de
Data Types: Word list, Lexicon, Annotated signal, Interlinear text, Paradigm, Field notes, Description, Common
Functions: Store, Create, Convert, Display, Query
I am working on the documentation of Hochunk (Winnebago), a highly endangered Siouan language spoken in Wisconsin, USA.

Wallace Hooper, Bloomington, Indiana, USA
Indiana University
Email: whooper@indiana.edu
Data Types: Metadata, Word list, Lexicon, Annotated signal, Interlinear text, Paradigm, Field notes, Description, Common
Functions: Store, Create, Convert, Display, Query
Creation of a flexible, XML-based interlinear-text and dictionary processor

Chu-Ren Huang, Taipei, Taiwan
Academia Sinica
Email: churen@sinica.edu.tw
Data Types: Word list, Lexicon, Interlinear text
Functions: Store, Create, Convert, Query
Over the past 12 years, I directed (or co-directed) and completed the following Chinese language resources: 1) CKIP lexicon (>80k entries), 2) Sinica Corpus (5 million words, balanced and tagged), 3) Academia Sinica Archaic Chinese Corpus (5 million characters, roughly 300-0 BC), 4) Academia Sinica Classical Chinese Corpora, Sinica Treebank (>30,000 sentences).
In the past two years, I was involved in the NSC Digital Library/Museum Initiative of Taiwan and will be completing a second linguistic and literary knowledge site in November.
Currently, I have just initiated a group of three-year projects targeting English-Chinese and Chinese-English bilingual Wordnet, and eventually, a Chinese Wordnet. A part of the project is affiliated with the NSF-NSC IDLP collaboration.
Starting next year, I will be directing the corpus part of a new National Digital Archive project in Taiwan. The corpora targeted include a balanced corpus of 20th century Mandarin Chinese, a diachronic corpus of near modern Chinese, and corpora of Formosan (Austronesian) languages.

Shlomo Izre'el, Tel Aviv, Israel
Tel Aviv University
Email: izreel@post.tau.ac.il
Web: http://spinoza.tau.ac.il/hci/dep/semitic/izreel.html
Data Types: Metadata, Annotated signal, Interlinear text, Linguistic description
Functions: Store, Create, Convert, Display, Query
The Corpus of Spoken Israeli Hebrew (http://spinoza.tau.ac.il/hci/dep/semitic/cosih.html). Electronic publication of ancient Semitic languages (Akkadian, Canaano-Akkadian) (http://spinoza.tau.ac.il/hci/dep/semitic/amarna.html).

Michel Jacobson, Villejuif, France
CNRS/LACITO
Email: jacobson@idf.ext.jussieu.fr
Web: http://195.83.92.32/
Data Types: Metadata, Word list, Lexicon, Annotated signal, Writing system, Interlinear text
Functions: Store, Create, Convert, Display, Query
I am a computer/linguist working on structured text data formats for texts, lexicons, and comparative phonological data. I make and/or use tools for creating, browsing and querying time-aligned speech or video data.

Aravind Joshi, Philadelphia, Pennsylvania, USA
University of Pennsylvania
Email: joshi@linc.cis.upenn.edu
Web: http://www.cis.upenn.edu/~joshi
Data Types: Metadata, Word list, Lexicon, Annotated signal, Paradigm, Description
Functions: Store, Create, Convert, Display, Query
Natural language processing. Linguistic, computational, statistical, and psycholinguistic aspects of language processing.

Alex Kasonde, Atlanta, Georgia, USA
Emory University
Email: alex.kasonde@learnlink.emory.edu
Data Types: Word list, Lexicon, Annotated signal, Writing system, Paradigm, Description
Functions: Store, Create, Display, Query
Right now I am working on the corpora of Icibemba (Guthrie: M.42), the major language of Zambia. This project involves the transcription, translation and digitization of oral data, mostly radio broadcasts of different generic categories recorded on tape. This project is intended primarily for teaching and research, but the wider public can learn to utilize it.
As a member of the European Association of Lexicography (Euralex), I am primarily interested in computerized lexicography. This includes word lists, lexicons, terminologies, dictionaries and encyclopaedias. Of special interest to me is the utilization of graphics (video effects) in illustrated versions (photos, drawings, paintings, tables and graphs etc). The prospect of adding audio effects to graphics is also particularly exciting for an encyclopaedia of African ethnomusicological generic types.

Ulrike Kiefer, Lampertheim, Germany
Foerderverein fuer Jiddische Sprache und Kultur, Duesseldorf
Email: kiefer@rhein-neckar.netsurf.de
Web: http://www.eydes.org
Data Types: Metadata, Word list, Lexicon, Annotated signal, Interlinear text, Field notes
Functions: Store, Convert, Display, Query
We employ language engineering methods for preserving, exploiting and disseminating the information inherent in the Language and Culture Atlas of Ashkenazic Jewry. This collection consists of 6000 hours of spoken language data primarily in the Yiddish language with parts in at least half a dozen other languages, including English, Hebrew, French, German, Hungarian, and Romanian. Furthermore, the contents of the interviews refer to the knowledge of other cultural spheres, making this a particularly rich data resource, both from a linguistic and cultural point of view. The medium for dissemination and access to the collection is the Internet. The data is presented in the original sound linked to transcripts and a series of annotation layers. The electronic archive is being built as a bi-medial database (sound/transcripts/indexes).

Sung-Hyuk Kim, Seoul, Korea
Sookmyung Women's University
Email: ksh@sookmyung.ac.kr
Web: http://lis.sookmyung.ac.kr/~ksh
Data Types: Metadata, Word list, Lexicon
Functions: Create
I have been involved in many digital library and text encoding projects. My major area is document structuring using SGML and XML for the Web. Recently I have been preparing a digital library project on Korean Culture and Heritage with Tufts University. For this project, I am going to develop a bilingual parallel corpus and a bilingual dictionary for cross-language information retrieval between Korean and English. I have also been involved in standardization activities for electronic commerce, such as electronic documents and catalogs using XML.

Tom Klingler, New Orleans, Louisiana, USA
Tulane University
Email: klingler@tulane.edu
Web: http://www.tulane.edu/~klingler/
Data Types: Lexicon, Writing system, Field notes, Linguistic description
Functions: Store, Create, Display, Query
I have been engaged for the past ten years in the documentation and description of Louisiana Creole, an endangered French-lexifier creole spoken in south Louisiana. My recorded corpus consists of nearly 200 hours of audio tapes of variable quality (the best are in DAT format), as well as approximately forty hours of videotapes that are of very high quality. Most of the recordings consist of one-on-one interviews with Creole speakers.
I am co-author (with Albert Valdman, Margaret M. Marshall, and Kevin J. Rottet) of a dictionary of Louisiana Creole and am currently completing a grammatical description of the variety spoken in Pointe Coupee Parish. I am interested in exploring ways to enhance the presentation and analysis of my data in digital formats.

Kai-Uwe Kuehnberger, Tuebingen, Germany
University of Tübingen, Germany (Theoretical Computational Linguistics)
Email: kaiuwe@sfs.nphil.uni-tuebingen.de
Data Types: Annotated signal, Field notes, Description
Functions: Store, Create
The department of Theoretical Computational Linguistics (University of Tübingen) is in the process of initiating a project developing formal frameworks in order to model hypertexts and natural language annotations. We think that the natural models for representing structured texts and annotation graphs are certain classes of coalgebras. From a theoretical perspective we will try to find complexity-theoretic results concerning these classes of coalgebras. The long-term perspective is to develop a formal semantics for hypertexts and natural language annotations. Furthermore, we will exemplify these theoretical results using applications.

Sadao Kurohasi, Kyoto, Japan
Kyoto University
Email: kuro@i.kyoto-u.ac.jp
Data Types: Metadata, Word list, Lexicon, Interlinear text
Functions: Store, Create, Convert
I have been developing several Japanese language resources, including JUMAN (morphological analyzer), KNP (dependency analyzer), and Kyoto University Corpus (40,000 sentences with syntactic tags).

Alan Lee, Philadelphia, Pennsylvania, USA
Linguistic Data Consortium
Email: aleewk@babel.ling.upenn.edu
Web: http://www.ldc.upenn.edu/exploration/expl2000
I am a graduate student in Linguistics at Penn, and am helping out with the organization of this workshop.

Paola Monachesi, Utrecht, The Netherlands
Utrecht University
Email: Paola.Monachesi@let.uu.nl
Web: http://www-uilots.let.uu.nl/~Paola.Monachesi/personal
Data Types: Word list, Interlinear text, Paradigm, Linguistic description
Functions: Store, Create, Convert, Display, Query
I am coordinating a new project launched in the Netherlands which aims at the creation of a `Typological Database System'. The project will combine (and extend) existing typological databases with new ones to be created. The ultimate goal is to develop a system that is able to link separate databases in such a way that we can ask questions over the whole set of databases. To this end, a meta-language will be developed. The metadatabase is intended to be part of a `Language Typology Resource Centre', a web-accessible electronic archive for typological description, including powerful research tools such as grammars, typological databases and language-typological expert systems.

Fenson Mwape, Takadacho, Japan
Osaka University of Foreign Studies / University of Zambia
Email: mwape1970@yahoo.com
Data Types: Word list, Field notes, Description
Functions: Store
I work on the minority languages of Zambia, focusing on the Bemba group of languages. My work mainly involves documenting these hitherto unwritten languages.

Toshihide Nakayama, Upper Montclair, New Jersey, USA
Montclair State University
Email: nakayamat@alpha.montclair.edu
Data Types: Word list, Lexicon, Interlinear text, Field notes
Functions: Store, Create, Display, Query
I have been working on Nuu-chah-nulth (a.k.a. Nootka: Wakashan). Over the years I have accumulated a fair amount of text and lexical materials. I am very much interested in the idea of an on-line grammar, texts and lexicon, and I am exploring good ways to build online texts (audio, video, as well as written materials) and a hyperlinked grammar and lexicon.

Robert Neumann, Mannheim, Germany
Foerderverein fuer Jiddische Sprache und Kultur, e.V., Duesseldorf
Email: robert.neumann@rhein-neckar.netsurf.de
Web: http://www.eydes.org
Data Types: Metadata, Word list, Lexicon, Annotated signal, Interlinear text, Field notes
Functions: Store, Convert, Display, Query
We employ language engineering methods for preserving, exploiting and disseminating the information inherent in the Language and Culture Atlas of Ashkenazic Jewry. The collection consists of 6000 hours of spoken language data primarily in the Yiddish language with parts in at least a half dozen other languages, including English, Hungarian, French, German, Hebrew, and Romanian. Furthermore, the contents of the interviews refer to the knowledge of other cultural spheres, making this a particularly rich data resource, both from a linguistic and cultural point of view. The medium for dissemination and access to the collection is the Internet. The data is presented in the original sound, linked to transcripts and a series of annotation layers. The electronic archive is being built as a bimedial database (sound/transcripts/indexes).

Douglas Parks, Bloomington, Indiana, USA
Indiana University
Email: parksd@indiana.edu
Web: http://www.indiana.edu/~aisri
Data Types: Metadata, Lexicon, Annotated signal, Writing system, Interlinear text, Paradigm, Field notes, Description
Functions: Store, Create, Convert, Display, Query
A linguist who is documenting two Caddoan and two Siouan languages and preparing teaching materials for three of them. That work has necessitated creation of software programs that support both documentation (archiving) and dissemination in both printed and web formats.

Eleanor Robson, Oxford, UK
Electronic Text Corpus of Sumerian Literature, University of Oxford
Email: eleanor.robson@all-souls.ox.ac.uk
Web: http://www-etcsl.orient.ox.ac.uk
Data Types: Metadata, Word list, Lexicon, Linguistic description, Common
Functions: Store, Convert, Display, Query
Sumerian is one of the world's oldest written languages, and is still poorly understood. I am one of a small team who have been documenting, editing, translating and publishing an SGML-XML corpus of Sumerian literary texts from ca. 2000-1600 BC. We are now planning ways to analyse that corpus in order to document and describe aspects of its style, lexis, grammar and register.

Laurent Romary, Nancy, France
Laboratoire Loria
Email: Laurent.Romary@loria.fr
Data Types: Lexicon
Functions: Convert
Involved in TEI-related projects since 1994. Contributor to the reference annotation module in MATE. Project leader of the future ISO 16642 standard for representing terminological data collections in XML.

Pat Ryckman, Charlotte, North Carolina, USA
UNC Charlotte
Email: plryckma@email.uncc.edu
Web: http://libweb.uncc.edu/archives
Data Types: Metadata, Annotated signal, Interlinear text, Common
Functions: Store, Create, Convert, Display, Query
We are developing a cross-disciplinary digital sound archive as a research and teaching resource for the University and community. We plan to digitize, transcribe and encode, using XML, a collection of oral interviews and narratives for dissemination on the World Wide Web.

Mike Sangrey, Landisburg, Pennsylvania, USA
The Bantu Initiative
Email: msangrey@pa.net
Data Types: Metadata, Word list, Lexicon, Field notes
Functions: Store, Create, Display, Query
The `Bantu Initiative', headed by Roger Van Otterloo, seeks to capture the lexicon and grammar across the Bantu language landscape. It is currently in the envisioning stage of development and my involvement is currently being solidified; however, the intention is for me to bring my software development expertise (some XML based), and some linguistic skills to the project. I am also co-moderator of two Bible translation lists.

Harold Schiffman, Philadelphia, Pennsylvania, USA
University of Pennsylvania
Email: haroldfs@ccat.sas.upenn.edu
Web: http://ccat.sas.upenn.edu/~haroldfs
Data Types: Lexicon
Functions: Create
Since 1984, I have been compiling an English Dictionary of the Tamil Verb. It contains around 20,000 entries, each consisting of an English verb, its Tamil equivalent (in spoken and in written Tamil), verb class(es), example sentences (spoken and written), and synonyms. The next stage is to add sound files for the spoken entries and publish it on CD-ROM. Sample entries can be viewed at: http://ccat.sas.upenn.edu/plc/dictionary

Kiyoaki Shirai, Tokyo, Japan
Tokyo Institute of Technology
Email: kshirai@cl.cs.titech.ac.jp
Web: http://tanaka-www.cs.titech.ac.jp/%7Ekshirai/
Data Types: Lexicon, Interlinear text
Functions: Store, Create
In recent years, I have been involved in a research project to develop a text corpus annotated with POS tags and word sense tags. For the definition of word senses, we use the Iwanami Dictionary, a published Japanese dictionary. Using this corpus for Japanese SENSEVAL is now under active consideration.

Andrew Simpson, Hayward, California, USA
CBOLD/University of California, Berkeley
Email: aksimpso@socrates.berkeley.edu
Data Types: Lexicon
Functions: Store, Convert
I am working with the Comparative Bantu Online Dictionary, working out issues concerning the markup of dictionary materials for use in a database to be used, among other things, for comparative linguistic work. Currently, I'm working on the markup of a Ganda dictionary.

Stavros Skopeteas, Erfurt, Germany
University of Erfurt
Email: stavros.skopeteas@uni-erfurt.de
Data Types: Lexicon, Interlinear text, Paradigm, Field notes, Description
Functions: Store, Create, Query
Databases for lexical material, syntactic constructions, functional descriptions, typological data etc.; elicitation of speech via non-linguistic task games; situational conditioning of language documentation

Debra Spitulnik, Atlanta, Georgia, USA
Emory University
Email: dspitul@learnlink.emory.edu
Web: http://www.emory.edu/COLLEGE/ANTHROPOLOGY/FACULTY/ANTDS/
Data Types: Metadata, Lexicon, Annotated signal, Interlinear text, Paradigm, Field notes, Description
Functions: Store, Create, Convert, Display, Query
Bemba, a member of the Bantu language family, is the most widely spoken language in Zambia (approx. 6 million speakers). I am developing a linguistic database to archive and analyze naturally occurring discourse collected in the field from a variety of speech genres (e.g. oratory, song, conversation, newscasts). Research questions derive mainly from sociolinguistics and linguistic anthropology, and include questions about (1) Bemba language change in the context of high multilingualism and high prestige English and (2) how oral traditions are transformed in electronic media. Collaborators on the project are working on other areas of grammar and lexicography.
Bemba Home page: http://www.emory.edu/COLLEGE/ANTHROPOLOGY/FACULTY/ANTDS/Bemba/
Digital Polyglot: http://www.emory.edu/COLLEGE/LINGUISTICS/POLYGLOT

Pirkko Suihkonen, Leipzig, Germany
Max Planck Institute for Evolutionary Anthropology
Email: suihkonen@eva.mpg.de
Web: http://www.ling.helsinki.fi/~suihkone/
Data Types: Metadata, Lexicon, Interlinear text, Common
Functions: Store, Create, Convert, Display
I have been working with documentation of electronic linguistic data, especially with data located on the University of Helsinki Language Corpus Server (http://www.ling.helsinki.fi/uhlcs). Right now, my work at the MPI-EVA, Leipzig, concerns developing meta-descriptions to be used as standards in multilingual and multimodal linguistic studies.

Sheri Tatsch, Davis, California, USA
Second Language Acquisition Institute, UC Davis
Email: sjtatsch@ucdavis.edu
Web: http://slai.ucdavis.edu/about.htm
Data Types: Word list, Lexicon, Annotated signal, Writing system, Interlinear text, Paradigm, Field notes
Functions: Store, Create, Display
SLAI's focus for our present project is the development of a single web site for the world's less commonly taught languages, with a comprehensive listing of materials and on-line resources, a listing of all institutions and centers of instruction, and identification of fluent speakers willing to mentor learners working with self-instruction materials. In addition we are purchasing learning aids for these languages, now being deposited in the Language Learning Center here at UC Davis. We are concerned with the lack of standards for data formatting and the inability to locate with ease current information on evolving research. As for myself, I am beginning work with the Nisenan people of Northern California, who at this time have no fluent speakers left. Documentation of what strands of language are left is imperative. Data creation and archives would be a blessing for this work and, of course, access to this data is fundamental. I have been concerned with standardized form in my approach to documenting and subsequently rebuilding everyday language use. Production of a dictionary and grammar with a writing system for Nisenan-ne is a goal.

Antonio Zampolli, Ghezzano, Italy
Istituto di Linguistica Computazionale - CNR
Email: eagles@ilc.pi.cnr.it
Web: http://www.ilc.pi.cnr.it
Data Types: Metadata, Lexicon, Interlinear text, Description
Functions: Store, Create, Convert, Display, Query
European Coordinator of the ISLE/EAGLES project. Previous responsibility in standardization of text corpora and lexicons. Design and creation of large computational lexicons with morphological, syntactic and semantic information. Annotation of text corpora at various levels of linguistic description.

