ORGANIZERS
Steven Bird, Philadelphia, Pennsylvania,
USA
University of Pennsylvania
Email: sb@ldc.upenn.edu
Web: http://www.ldc.upenn.edu/sb
Data Types: Metadata, Word list, Lexicon, Annotated signal, Writing
system, Interlinear text, Paradigm, Field notes, Description, Common
Functions: Store, Create, Convert, Display, Query
For the last 15 years I have been developing data models and software
tools in support of linguistic research. From 1995-97 I worked in
Cameroon, documenting and analyzing the tone systems of the Grassfields
Bantu languages, and developing computational methods to support the
work. An online example is available at http://www.ldc.upenn.edu/sb/fieldwork/.
I am associate director of the Linguistic Data Consortium, and I am a
principal investigator on
two NSF projects (Linguistic Exploration, TalkBank) providing
computational infrastructure for empirical work in linguistics and in the
social sciences more generally.
Gary Simons, Dallas, Texas, USA
SIL International
Email: gary_simons@sil.org
Web: http://www.sil.org/SIL/roster/simons.htm
Data Types: Metadata, Word list, Lexicon, Annotated signal, Writing system, Interlinear text, Paradigm, Field notes,
Description, Common
Functions: Store, Create, Convert, Display, Query
For fifteen years (1984-1999) as Director of Academic Computing for
SIL International I was involved in directing a number of projects
that developed software to assist field linguists in documenting and
describing language (IT, CELLAR, LinguaLinks, FieldWorks). In my
current position as Associate VP for Academic Affairs I'm still
involved in oversight of this area as well as of our efforts to
launch an on-line language archive. During the development of the
Text Encoding Initiative's guidelines for text markup, I was involved
as a member of the Committee on Text Analysis and Interpretation and
of the Technical Review Committee.
PRESENTERS AND PANELISTS
Helen Aguera, Washington, DC, USA
National Endowment for the Humanities
Email: HAguera@neh.gov
Web: http://www.neh.gov
Data Types: Metadata, Word list, Lexicon, Interlinear text, Field notes,
Description
Functions: Store, Create, Convert, Display, Query
I am a program officer in the NEH Division of Preservation and
Access. The Division's programs support various types of projects
related to language documentation and description. Awards fund the
following activities: preparation of dictionaries, grammars, and
corpora; reformatting of archival materials in order to preserve
them; establishment of intellectual access to materials in archives
and special collections; research and demonstration projects to
develop standards and best practices and to enhance the use of
digital technology to provide access to humanities resources.
Eric Albright, Duncanville, Texas,
USA
SIL International
Email: eric_albright@sil.org
Data Types: Writing system
Functions: Store
I am currently involved in thesis research pertaining to the
description of writing systems. I have been involved in structured
markup (SGML, XML) technology for about 5 years.
Jonathan Amith, Dallas, Oregon, USA
Yale University
Email: jonathan.amith@yale.edu
Data Types: Lexicon, Annotated signal, Interlinear text, Paradigm, Field notes,
Description
Functions: Store, Display, Query
The elaboration of a trilingual lexicon of Nahuatl (to Spanish and English)
including detailed fields (among others) on grammatical function,
morphology, semantic field, inflectional patterns, and roots. The writing
of a pedagogically oriented reference grammar for modern Nahuatl, the
electronic version of which will be linked to the lexicon. Creation of
learning exercises and texts for use in the classroom that will be made
interactive and placed online. Elaboration of a corpus of narrative
material including songs, prayers, life histories, conversation, etc. An
interest in having the ability to link lexicon, grammar, and corpus online
to facilitate rapid movement from one "section" to another.
Anthony Aristar, Ann Arbor, Michigan, USA
Eastern Michigan University
Email: aristar@linguistlist.org
Data Types: Metadata
Functions: Store
I am a co-moderator of The LINGUIST List, which is configuring a
database to store linguistics-related metadata. The database will
also store a limited amount of endangered languages data, but the
focus of our activities is collecting and providing metadata to the
discipline.
Helen Aristar-Dry, Ann Arbor,
Michigan, USA
Eastern Michigan University
Email: hdry@linguistlist.org
Data Types: Metadata
Functions: Store
I am a co-moderator of The LINGUIST List, which is configuring a
database to store linguistics-related metadata. The database will
also store a limited amount of endangered languages data, but the
focus of our activities is collecting and providing metadata to the
discipline.
Neal Audenaert, College Station, Texas,
USA
Texas A&M University
Email: neala@tamu.edu
Data Types: Lexicon, Common
Functions: Store, Create, Query
I am currently the primary investigator for a software development
project attempting to create a system to support archiving and
analyzing linguistic data. This system, the Language Data
Repository, will support distributed access to linguistic data over a
network, be that a local intranet, or the Internet, and host third
party tools to support data analysis.
John Bell, Philadelphia, Pennsylvania, USA
University of Pennsylvania
Email: jmbell@babel.ling.upenn.edu
Data Types: Lexicon
Functions: Store
We have made a study involving approximately 55 dictionaries,
investigating what sort of formats occur in them. From this we have
begun to develop a general model of lexical entries, as well as trying
to find what elements are universal in lexical entries in
dictionaries. We have created a number of sample entries in XML
format.
Daan Broeder, Nijmegen, The
Netherlands
Max Planck Institute for Psycholinguistics
Email: Daan.Broeder@mpi.nl
Data Types: Metadata, Word list, Lexicon, Annotated signal, Interlinear
text, Field notes, Description
Functions: Store, Create, Convert, Display, Query
Designing schema and implementing tools with respect to the
description of language resources for several projects especially within
the ISLE/EAGLES project. Influential participation in the TIDEL
development team for the DOBES project and for anthropological and
linguistic multimedia resources at the MPI.
Jean Carletta, Edinburgh, Scotland
Language Technology Group and Human Communication Research Centre,
University of Edinburgh
Email: jeanc@cogsci.ed.ac.uk
Web: http://www.cogsci.ed.ac.uk/~jeanc
Data Types: Annotated signal, Interlinear text, Paradigm, Field notes,
Description
Functions: Store, Create, Convert, Display, Query
The Language Technology Group has been developing technologies which
support language corpora annotated at multiple, non-hierarchically
structured levels, using XML/XSL with stand-off annotation. We have
particular interests in technologies for hand and automatic data
annotation and are currently beginning to explore ways of improving
support for working with data by making it possible to configure
coding, analysis, and display interfaces using graphical user
interfaces.
Peter Constable, Dallas, Texas, USA
SIL International
Email: peter_constable@sil.org
Web: http://www.sil.org
Data Types: Metadata, Writing system, Linguistic description
Functions: Store, Create, Convert, Display
For the past few years, I have been working as part of a team within
SIL conducting research to develop solutions for working with
non-Roman writing systems on computers. Overall, this effort has been
looking at issues of character encoding, encoding conversion,
keyboard input, complex-script rendering, and fonts. In terms of
software platforms, the focus has been on solutions for Microsoft
Windows, but also for the Mac OS. Areas of particular focus for me
have included the Unicode standard and, more recently, language
identifiers.
Sharon Correll, Dallas, Texas, USA
SIL International
Email: Sharon_Correll@sil.org
Data Types: Word list, Lexicon, Writing system, Interlinear text, Field
notes, Description, Common
Functions: Display
I'm involved in developing a text-rendering system called Graphite
that can be used for complex scripts. Graphite is programmable, so
it can be extended to handle varieties of writing system that are not
handled by system software (such as Uniscribe).
Megan Crowhurst, Austin, Texas, USA
University of Texas at Austin
Email: mcrowhurst@mail.utexas.edu
Web: http://uts.cc.utexas.edu/~crowhurs/index.html
Data Types: Word list, Interlinear text, Field notes, Description
Functions: Store, Display, Query
I have been engaged in field research on Tupi-Guarani languages spoken
in Bolivia, especially Guarayu (a language spoken by approximately
7000-8000 people in eastern Bolivia) since 1996. At present, I am
serving as the Chair of the Linguistic Society of America's Committee
on Endangered Languages and their Preservation.
Dafydd Gibbon, Bielefeld, Germany
Universität Bielefeld
Email: gibbon@spectrum.uni-bielefeld.de
Web: http://coral.lili.uni-bielefeld.de/~gibbon/
Data Types: Metadata, Word list, Lexicon, Annotated signal, Paradigm,
Field notes, Description
Functions: Store, Create, Convert, Display, Query
My special interests with regard to language description are varied, but
focus on the modelling of prosody, particularly in West African tone
languages, and on computational lexicography for spoken language, with
applications both to language documentation and speech technology. In
the language documentation domain, I am just finishing a 4-year
cooperation project with Université de Cocody, Abidjan,
Côte d'Ivoire (1997-2000), on designing an encyclopedia for
Ivorian languages, and have just started a 1-year pilot project "Ega:
a documentation model for an endangered Ivorian language" within the
DOBES consortium (Dokumentation bedrohter Sprachen - Documentation
of endangered languages) funded by the Volkswagen foundation, also in
cooperation with the Université de Cocody, in which my project
partners are Dr Firmin Ahoua, Cocody, and Dr Bruce Connell, Oxford.
Currently I am also working with Dr Eno-Abasi Urua of University of
Uyo, Nigeria, on the prosody of languages of the Cross River region of
Nigeria.
Jeff Good, Berkeley, California, USA
University of California, Berkeley/CBOLD
Email: jcgood@socrates.berkeley.edu
Data Types: Metadata, Word list, Lexicon, Common
Functions: Store, Create, Convert, Display, Query
Working at the Comparative Bantu Online Dictionary (CBOLD), I am
involved in preparing data sources for online use and in working out
the problems of how to turn a set of dictionaries and word lists into
a comparative database--i.e., linking entries to historical
reconstructions and linking together cognates across our sources.
Susan Hockey, London, UK
School of Library, Archive and Information Studies, UCL
Email: s.hockey@ucl.ac.uk
Web: http://www.ucl.ac.uk/slais/staff/shockey.html
Data Types: Metadata, Word list, Writing system, Description, Common
Functions: Store, Create, Display, Query
In a career in humanities computing that is now longer than I dare to
admit, I have been involved in text analysis software development
(COCOA and OCP), tools for the display of non-standard characters
(Greek, Arabic, Hebrew), the Text Encoding Initiative (as a Member of
the Steering Committee and of the Text Representation Committee),
metadata research (cataloging electronic texts at CETH (Rutgers and
Princeton), and the development of the Dublin Core), and electronic
publishing of a text base with SGML encoding of literary
interpretation (the Orlando Project at the Universities of Alberta
and Guelph). I am now a Professor of Library and Information Studies
teaching and researching in humanities information management. I am
interested in exploring how tools and techniques from humanities
computing can be brought together with established techniques and
standards in library and archive studies to develop Internet-based
electronic resources to serve the needs of humanities scholarship and
teaching.
Gary Holton, Fairbanks, Alaska, USA
Alaska Native Language Center
Email: gary.holton@uaf.edu
Web: http://www.uaf.edu/anlc
Data Types: Metadata, Word list, Lexicon, Field notes
Functions: Store, Create, Display
I am a descriptive linguist with field work experience in Alaska and
eastern Indonesia. I am interested in using the internet to make
archival documentation materials on Alaska Native languages available
in digital form and to permit input of data by other field workers.
Nancy Ide, Poughkeepsie, New York, USA
Vassar College
Email: ide@cs.vassar.edu
Web: http://www.cs.vassar.edu/~ide
Data Types: Metadata, Lexicon
Functions: Store, Create, Convert, Query
Development of EAGLES standard encoding and annotation formats for
linguistic corpora in XML (XCES), in particular for morpho-syntactic
encoding, parallel alignment, computational lexicons, and syntactic
annotation. Development of a standard general annotation formalism and
framework to support an EAGLES annotation repository.
Mark Liberman, Philadelphia, Pennsylvania,
USA
University of Pennsylvania
Email: myl@cis.upenn.edu
Web: http://www.ling.upenn.edu/~myl
Data Types: Metadata, Lexicon, Annotated signal, Interlinear text, Paradigm, Field notes
Functions: Store, Create, Convert, Display, Query
See http://www.ling.upenn.edu/~myl for description of activities.
Saturnino Luz, Odense, Denmark
University of Southern Denmark
Email: luzs@acm.org
Web: http://tec.ccl.umist.ac.uk/
Data Types: Metadata
Functions: Store
I've collaborated on the TEC (Translational English Corpus), defining
an architecture for access to the corpus over the internet and
implementing most of the client-server software.
Kazuaki Maeda, Philadelphia,
Pennsylvania, USA
LDC, University of Pennsylvania
Email: maeda@ldc.upenn.edu
Data Types: Metadata, Word list, Annotated signal, Interlinear text
Functions: Store, Create, Convert, Display
I am a programmer working with Steven Bird at LDC. I am also a
graduate student at Penn working in the areas of speech technologies
and phonetics research.
Anne Mahoney, Medford, Massachusetts,
USA
Perseus Project, Tufts University
Email: amahoney@perseus.tufts.edu
Web: http://www.perseus.tufts.edu
Data Types: Metadata, Lexicon, Paradigm
Functions: Store, Create, Convert, Display, Query
Our digital library includes lexica, grammars, and morphological
analyzers for the languages we deal with. We work on automatic
information extraction (automated markup), and are starting to work on
cross-language information retrieval.
Elena Maslova, San Francisco, California,
USA
University of Bielefeld
Email: Maslova@jps.net
Data Types: Metadata, Lexicon, Annotated signal, Interlinear text,
Description
Functions: Store, Convert, Display
A system for linguistic analysis of text data (Paradox,
ObjectPal), intended primarily for agglutinative languages (at the
St. Petersburg Institute of Linguistic Research, Paleo-Siberian
Department, in cooperation with Eugene Levin). A general "Language
Description System", intended as a framework for computer-based
descriptive grammars (HyperCard, a project at the University of
Bielefeld, led by Christian Lehmann). A system for analysis and
transcription of acoustic records (Paradox, Delphi, in cooperation
with Eugene Levin). A system for analysis, representation and
indexation of texts (Java, University of Leiden). A descriptive
grammar of Yukaghir.
Mike Maxwell, Waxhaw, North Carolina, USA
SIL
Email: Mike_Maxwell@sil.org
Data Types: Word list, Lexicon, Annotated signal, Writing system, Interlinear
text, Paradigm, Description
Functions: Create, Display, Query
I work in the areas of computational morphology and phonology. With
respect to the conference, my interests are in methods for rapid documentation
of phonological, grammatical and lexical analyses of languages, and
in making those analyses and the intermediate data used to create
them permanently available in searchable electronic form.
Patrick McConvell, Canberra,
Australia
Australian Institute of Aboriginal and Torres Strait Islander
Studies
Email: patrick@aiatsis.gov.au
Web: http://www.aiatsis.gov.au
Data Types: Word list, Lexicon, Interlinear text
Functions: Store, Convert, Query
I have been appointed to the position of Research Fellow, Language and
Society at AIATSIS, Canberra, this year. I have a background in
anthropology and linguistics and have worked on description of
Australian indigenous languages, and in bilingual education, language
maintenance intervention and in training of indigenous language
workers. My own current research is mainly related to language shift
and maintenance, and language, culture and history, but I also deal
with issues of documentation on behalf of the research section
of AIATSIS, especially the needs of indigenous clients and their
organisations. AIATSIS, in addition to holding a large print and
audio-visual collection has developed a digital archive, ASEDA, mainly
devoted to Australian indigenous language documentation.
Anthony McEnery, Lancaster, UK
Lancaster University
Email: mcenery@comp.lancs.ac.uk
Web: http://www.ling.lancs.ac.uk/staff/tony/tony.htm
Data Types: Writing system
Functions: Convert
Largely concerned with harmonising multiple representations of South
Asian writing systems into UNICODE. Especially interested in doing
that within the context of the General Architecture for Text
Engineering (GATE).
Lev Michael, Austin, Texas, USA
University of Texas at Austin
Email: lmichael@mail.utexas.edu
Web: http://uts.cc.utexas.edu/~ailla
Data Types: Metadata, Interlinear text, Field notes, Linguistic description
Functions: Create, Display, Query
As a member of the team that is developing the Archive of the
Indigenous Languages of Latin America (AILLA), I am involved in
ethnographic and discourse typological research, and the implementation
of the resulting taxonomy in the archive's metadata structures, and in
user-friendly search interfaces. I am also conducting ongoing research
on Nanti, an Arawakan language with roughly 500 speakers in
southeastern Peru. I am presently concentrating on basic descriptions
of Nanti phonology, prosody, and morphology, and on documentation of
everyday discourse.
Boyd Michailovsky, Villejuif,
France
LACITO, CNRS, France
Email: Boyd.Michailovsky@vjf.cnrs.fr
Data Types: Metadata, Lexicon, Annotated signal, Writing system, Interlinear text, Paradigm, Description
Functions: Store, Create, Convert, Display, Query
I am a general linguist working mainly on Tibeto-Burman languages of the Himalayan
region and using structured text data formats for texts, lexicons, and comparative
phonological data. Recently I have participated in the design of a research-oriented
archive of time-aligned speech on the web (http://lacito.vjf.cnrs.fr/archivage.htm);
about 90 minutes of texts in Hayu and Limbu (Tibeto-Burman languages of Nepal) are
currently available for browsing and limited query.
Michael Nelson, Hampton, Virginia, USA
NASA LaRC & University of North Carolina
Email: mln@ils.unc.edu
Web: http://www.ils.unc.edu/~mln/
Data Types: Metadata
Functions: Store, Create, Display
Digital libraries: Open Archive Initiative (OAI) and the Smart Object,
Dumb Archive (SODA) model.
Nicholas Ostler, Bath, England
Foundation for Endangered Languages
Email: nostler@chibcha.demon.co.uk
Web: http://www.ogmios.org, http://www.chibcha.demon.co.uk
Data Types: Lexicon, Interlinear text, Description, Common
Functions: Create, Query
As President of FEL and editor of its newsletter Ogmios, I encourage
submission of proposals for documentation work relevant to endangered
languages, and give coverage to work that is current. I am also
currently documenting the South American languages Muisca (Chibcha)
and Uwa (Tunebo). As principal consultant at Linguacubun, I work on
European-, US- & UK-funded projects in language description (especially
for machine translation).
Martha Palmer, Philadelphia, PA, USA
University of Pennsylvania
Email: mpalmer@cis.upenn.edu
Web: http://www.cis.upenn.edu/~mpalmer
Data Types: Metadata, Word list, Lexicon
Functions: Store, Create, Convert, Display, Query
American Coordinator for ISLE, International Standards for Language Engineering (joint
NSF/EU project). PI for the Penn Chinese Treebank: 100K Chinese words annotated for
segmentation, POS tagging, and syntactic bracketing.
Bill Poser
Email: Bill_Poser@telus.net
Joel Sherzer, Austin, Texas, USA
University of Texas
Email: jsherzer@mail.utexas.edu
Data Types: Metadata, Interlinear text
Functions: Create, Display, Query
I have been carrying out research among the Kuna Indians of Panama
since 1970. This involves the archiving and documentation of forms of
discourse. Recently I have been involved with the AILLA project to
archive forms of Latin American indigenous discourse on the web.
Ronald Sprouse, Berkeley, California,
USA
UC Berkeley
Email: ronald@uclink.berkeley.edu
Data Types: Metadata, Word list, Lexicon, Interlinear text
Functions: Store, Create, Display, Query
I have worked for several years as Technical Director of the Ingush
project at UC Berkeley, in which we are compiling a dictionary,
grammar, and set of interlinearized texts. I have developed group
collaboration software for collecting and annotating texts, as well as
a basic markup standard for these texts. In addition, I have worked
with the Comparative Bantu On-Line Dictionary project at UC Berkeley,
in which we are attempting to provide a uniform query mechanism to a
set of heterogeneous data sources, particularly word lists and
lexicons, from a large set of Bantu languages.
Richmond Thomason, Ann Arbor, Michigan, USA
University of Michigan
Email: rich@thomason.org
Web: http://www.eecs.umich.edu/~rthomaso/
Data Types: Lexicon
Functions: Convert, Query
Research interests in Artificial Intelligence, especially in Computational
Linguistics. I have assisted in the development of lexical materials for
Montana Salish.
Steve Tinney, Philadelphia, Pennsylvania,
USA
AMES, University of Pennsylvania
Email: stinney@sas.upenn.edu
Data Types: Metadata, Word list, Lexicon, Writing system, Interlinear
text
Functions: Store, Create, Convert, Display, Query
Linguistic, literary and cultural research on Sumerian and
Mesopotamia. Co-Director, Pennsylvania Sumerian Dictionary Project.
I am heavily involved in computerizing the PSD, developing an XML
framework for integration of primary texts, tools like signlists and
the lexicon.
David Weber, Westmoreland, NY, USA
SIL International
Email: david_weber@sil.org
Data Types: Lexicon, Writing system, Interlinear text, Field notes
Functions: Store, Create, Convert, Display
I have written a grammar of Huallaga (Huanuco) Quechua, coauthored one
on Bora (a Witotoan language spoken in Peru and Colombia), and am
currently coauthoring a grammar of Arabela (the last Zaparoan
language, spoken by fewer than 100 people in northern Peru). I
coauthored a dictionary of Huallaga Quechua (with equivalents,
translations and indices in both Spanish and English).
Steven Weinberger, Fairfax, Virginia,
USA
George Mason University
Email: weinberg@gmu.edu
Web: http://mason.gmu.edu/~weinberg
Data Types: Metadata, Annotated signal, Linguistic description
Functions: Display, Query
Steven Weinberger is chief investigator and administrator of the speech
accent archives (http://classweb.gmu.edu/accent), a repository of
non-native English speech and native dialects of English.
Douglas Whalen, New Haven, Connecticut, USA
Endangered Language Fund
Email: whalen@haskins.yale.edu
Web: http://macserver.haskins.yale.edu/haskins/STAFF/whalen.html
Data Types: Metadata, Word list, Interlinear text
Functions: Store, Create, Convert, Display, Query
The Endangered Language Fund is beginning an online collection of
material in the Algonquian languages. The first page (on Maliseet) has
some initial material (http://www.ling.yale.edu/~elf/maliseet1.html).
Our hope is to make searching for reflexes of Algonquian roots easy.
Peter Wittenburg, Nijmegen,
The Netherlands
Max Planck Institute for Psycholinguistics
Email: Peter.Wittenburg@mpi.nl
Web: http://www.mpi.nl
Data Types: Metadata, Word list, Lexicon, Annotated signal, Interlinear
text, Field notes, Description
Functions: Store, Create, Convert, Display, Query
Head of the development and archiving activities for multimedia
language resources at the MPI. Responsible person for the DOBES
project for Documenting Endangered Languages. Participation in the
EAGLES/ISLE project for defining a proposal for Meta Descriptions for
Multimedia Language Resources.
OTHER PARTICIPANTS (Latest Update)
John Alderete, Swarthmore, Pennsylvania, USA
Swarthmore College
Email: jaldere1@swarthmore.edu
Web: http://www.swarthmore.edu/SocSci/jaldere1/ling_jaldere1.html
Data Types: Word list, Lexicon, Annotated signal, Interlinear text, Paradigm, Field
notes, Description
Functions: Store, Create, Convert, Display, Query
I work on the northern Athabaskan language Tahltan. I'm doing primary linguistic
description of the sound inventory, morpho-phonemics, and the structure of verb
words. In collaboration with Tanya Bob and Patricia Shaw at the University of
British Columbia, I'm involved in a variety of corpus construction activities,
including transcribing large sound files with fieldnotes and texts, and the
construction of a lexical database.
Eva Banik, Philadelphia, Pennsylvania, USA
Linguistic Data Consortium
Email: ebanik@babel.ling.upenn.edu
Web: http://www.mentha.hu/~vica
Data Types: Metadata
Functions: Query
Developing data and service providers for language data.
Robert Beard, Lewisburg, Pennsylvania, USA
Bucknell University/yourDictionary.com
Email: rbeard@yourdictionary.com
Web: http://www.yourdictionary.com
Data Types: Word list, Lexicon
Functions: Store, Create, Convert, Display, Query
I have just launched an e-business called "yourDictionary.com" which
will soon start publishing grammars and dictionaries on-line. One of
our initiatives will be our "Endangered Language Repository" which
will offer free webhosting for grammars and dictionaries of endangered
languages.
John Albert Bickford, Catalina, Arizona, USA
SIL-Mexico
Email: albert_bickford@sil.org
Data Types: Metadata, Lexicon, Writing system, Interlinear text, Field
notes, Description
Functions: Store, Create, Convert, Display, Query
Linguistics editor for SIL-Mexico website and webmaster for NDSIL
website, both of which are a venue for publication of linguistic
materials.
Jack Cain, Toronto, Canada
Multilingual E-Data Solutions (Multedata)
Email: jcain@multedata.ca
Web: http://www.multedata.ca
Data Types: Word list, Lexicon, Writing system
Functions: Store, Create, Convert, Display, Query
Our small consulting firm has been active for two years in assisting
the new Canadian Arctic territory of Nunavut in implementing Canadian
Aboriginal syllabics. A web-based collaborative but administered
Inuktitut "Living Dictionary" is one of main products we have
designed.
Nicoletta Calzolari Zamorani, Ghezzano, Italy
Istituto di Linguistica Computazionale - CNR
Email: glottolo@ilc.pi.cnr.it
Web: http://www.ilc.pi.cnr.it
Data Types: Metadata, Lexicon, Interlinear text, Description
Functions: Store, Create, Convert, Display, Query
European Responsible of the ISLE/EAGLES Computational Lexicon Working Group for
standardization of multilingual lexicons. Previous responsibility in standardization
of morphosyntax, syntax (subcategorization) and semantics in computational lexicons.
Design and creation of large computational lexicons with morphological, syntactic and
semantic information. Annotation of text corpora at various levels of linguistic
description.
Khalid Choukri, Paris, France
ELRA/ELDA
Email: choukri@elda.fr
Web: http://www.elda.fr
Data Types: Metadata, Word list, Lexicon
Functions: Store, Create, Convert
Managing director of the European Language Resources Distribution Agency (ELDA).
Christopher Cieri, Philadelphia, Pennsylvania, USA
Linguistic Data Consortium
Email: ccieri@ldc.upenn.edu
Web: http://www.ldc.upenn.edu
Data Types: Metadata, Word list, Lexicon, Annotated signal, Writing system,
Interlinear text, Paradigm, Field notes, Description, Common
Functions: Store, Create, Convert, Display, Query
Executive Director of the Linguistic Data Consortium. Presented at the first
Linguistic Exploration workshop on New Methods for Creating, Exploring and
Disseminating Linguistic Field Data. Continuing work on the description of
regional varieties of Italian.
Robert Cox, Philadelphia, Pennsylvania, USA
American Philosophical Society
Email: rscox@amphilsoc.org
Data Types: Metadata, Field notes, Description
Functions: Store, Create, Convert, Display, Query
Practicing archivist with oversight over a large collection of Native
American sound recordings and other language data.
Sean Crist, Lansdowne, Pennsylvania, USA
University of Pennsylvania
Email: kurisuto@unagi.cis.upenn.edu
Web: http://www.ling.upenn.edu/~kurisuto
Data Types: Metadata, Word list, Lexicon, Writing system, Interlinear text, Paradigm,
Description
Functions: Store, Create, Convert, Display, Query
I'm working on creating online materials on the early Germanic
languages (Gothic, Old English, Old Icelandic, and Old High German; see http://www.ling.upenn.edu/~kurisuto/germanic/language_resources.html).
My long-term goal is to create a comprehensive comparative database of
these languages and to experiment with automated language
reconstruction techniques.
Michael Dukes, Stanford, California, USA
Stanford University & University of Canterbury
Email: mdukes@stanford.edu
Data Types: Metadata, Word list, Lexicon, Annotated signal, Writing
system, Interlinear text, Paradigm, Field notes, Description, Common
Functions: Store, Create, Convert, Display, Query
My main areas of interest are in Austronesian linguistics, focussing
on morphosyntax. I am particularly interested at present in developing
flexible corpus and database materials for research and teaching
purposes. I have taught linguistic field methods on a regular basis
using Filemaker but am interested in learning more about the
initiatives under discussion at this workshop with a view to using
this technology in the future.
Edward Garrett, Charlottesville, Virginia, USA
University of Virginia
Email: eg3p@virginia.edu
Web: http://faculty.virginia.edu/tibet-initiative/library/resources/adrdp/frameset.html
Data Types: Metadata, Word list, Lexicon, Annotated signal, Writing system, Interlinear text,
Linguistic description
Functions: Store, Create, Display, Query
I am working on a project at UVA which documents colloquial Tibetan.
We are digitizing videos, transcribing, translating, and developing
software tools for language learning and analysis.
David Golumbia, New York City, New York, USA
Independent Scholar
Email: dgolumbi@panix.com
Web: http://www.mindspring.com/~dgolumbi/docs/cv.html
Data Types: Metadata, Lexicon, Annotated signal, Writing system, Interlinear text, Paradigm
Functions: Store, Display, Query
I am a consultant to the East Cree Interactive Grammar project (http://www.eastcree.org), and also a cultural studies
scholar and software/web developer, and am very interested in becoming more involved in
projects to make indigenous and endangered languages more widely available and represented
in new technologies, especially the web.
K. David Harrison, Philadelphia, Pennsylvania, USA
Yale (Endangered Language Fund) and UPenn/IRCS
Email: kdh2@linc.cis.upenn.edu
Web: http://sapir.ling.yale.edu/~ASLEP/ASLEP.htm
Data Types: Metadata, Lexicon, Annotated signal, Writing system, Field
notes
Functions: Store, Display
Currently working on an ethnographic and linguistic documentation of
several endangered Siberian languages and cultures, within the
framework of the DOBES project (funded by Volkswagen-Stiftung).
Benjamin Hary, Atlanta, Georgia, USA
Emory University
Email: bhary@emory.edu
Data Types: Word list, Lexicon, Writing system
Functions: Create, Query
I am the principal investigator of CoSIH, the Corpus of Spoken
Israeli Hebrew, with Shlomo Izre'el of Tel Aviv University, who will
also attend the workshop.
Johannes Helmbrecht, Erfurt, Germany
University of Erfurt
Email: johannes.helmbrecht@uni-erfurt.de
Data Types: Word list, Lexicon, Annotated signal, Interlinear text,
Paradigm, Field notes, Description, Common
Functions: Store, Create, Convert, Display, Query
I am working on the documentation of Hochunk (Winnebago), a highly
endangered Siouan language spoken in Wisconsin, USA.
Wallace Hooper, Bloomington, Indiana, USA
Indiana University
Email: whooper@indiana.edu
Data Types: Metadata, Word list, Lexicon, Annotated signal, Interlinear
text, Paradigm, Field notes, Description, Common
Functions: Store, Create, Convert, Display, Query
Creation of a flexible, XML-based interlinear-text and dictionary
processor
Chu-Ren Huang, Taipei, Taiwan
Academia Sinica
Email: churen@sinica.edu.tw
Data Types: Word list, Lexicon, Interlinear text
Functions: Store, Create, Convert, Query
Over the past 12 years, I directed (or co-directed) and completed the
following Chinese language resources:
1) CKIP lexicon (>80k entries),
2) Sinica Corpus (5 million words, balanced and tagged),
3) Academia Sinica Archaic Chinese Corpus (5 million characters,
roughly 300-0 BC),
4) Academia Sinica Classical Chinese Corpora, Sinica Treebank
(>30,000 sentences).
In the past two years, I was involved in the NSC Digital
Library/Museum Initiative of Taiwan and will be completing a second
linguistic and literary knowledge site in November.
Currently, I have just initiated a group of three-year
projects targeting English-Chinese and Chinese-English bilingual
Wordnet, and eventually, a Chinese Wordnet. A part of the project is
affiliated with the NSF-NSC IDLP collaboration.
Starting next year, I will be directing the corpus part of
a new National Digital Archive project in Taiwan. The corpora
targeted include a balanced corpus of 20th century Mandarin Chinese,
a diachronic corpus of near modern Chinese, and corpora of Formosan
(Austronesian) languages.
Shlomo Izre'el, Tel Aviv, Israel
Tel Aviv University
Email: izreel@post.tau.ac.il
Web: http://spinoza.tau.ac.il/hci/dep/semitic/izreel.html
Data Types: Metadata, Annotated signal, Interlinear text, Linguistic description
Functions: Store, Create, Convert, Display, Query
The Corpus of Spoken Israeli Hebrew (http://spinoza.tau.ac.il/hci/dep/semitic/cosih.html).
Electronic publication of ancient Semitic languages (Akkadian, Canaano-Akkadian)
(http://spinoza.tau.ac.il/hci/dep/semitic/amarna.html).
Michel Jacobson, Villejuif, France
CNRS/LACITO
Email: jacobson@idf.ext.jussieu.fr
Web: http://195.83.92.32/
Data Types: Metadata, Word list, Lexicon, Annotated signal, Writing
system, Interlinear text
Functions: Store, Create, Convert, Display, Query
I am a computer/linguist working on structured text data formats for
texts, lexicons, and comparative phonological data. I make and/or use
tools for creating, browsing and querying time-aligned speech or
video data.
Aravind Joshi, Philadelphia, Pennsylvania, USA
University of Pennsylvania
Email: joshi@linc.cis.upenn.edu
Web: http://www.cis.upenn.edu/~joshi
Data Types: Metadata, Word list, Lexicon, Annotated signal, Paradigm, Description
Functions: Store, Create, Convert, Display, Query
Natural language processing. Linguistic, computational, statistical,
and psycholinguistic aspects of language processing.
Alex Kasonde, Atlanta, Georgia, USA
Emory University
Email: alex.kasonde@learnlink.emory.edu
Data Types: Word list, Lexicon, Annotated signal, Writing system,
Paradigm, Description
Functions: Store, Create, Display, Query
Right now I am working on the corpora of Icibemba (Guthrie: M.42),
the major language of Zambia. This project involves the
transcription, translation and digitization of oral data, mostly radio
broadcasts of different generic categories recorded on tape. This project
is intended primarily for teaching and research but the wider public can
learn to utilize it.
As a member of the European Association of Lexicography (Euralex), I
am primarily interested in computerized lexicography. This includes word
lists, lexicons, terminologies, dictionaries and encyclopaedias. Of
special interest to me is the utilization of graphics (video effects) in
illustrated versions (photos, drawings, paintings, tables and graphs
etc). The prospect of adding audio effects to graphics is also
particularly exciting for an encyclopedia of African ethnomusicological
generic types.
Ulrike Kiefer, Lampertheim, Germany
Foerderverein fuer Jiddische Sprache und Kultur, Duesseldorf
Email: kiefer@rhein-neckar.netsurf.de
Web: http://www.eydes.org
Data Types: Metadata, Word list, Lexicon, Annotated signal, Interlinear
text, Field notes
Functions: Store, Convert, Display, Query
We employ language engineering methods for preserving, exploiting and
disseminating the information inherent in the Language and Culture
Atlas of Ashkenazic Jewry. This collection consists of 6000 hours of
spoken language data primarily in the Yiddish language with parts in
at least half a dozen other languages, including English, Hebrew,
French, German, Hungarian, and Romanian. Furthermore, the contents of
the interviews refer to the knowledge of other cultural spheres making
this a particularly rich data resource, both from a linguistic and
cultural point of view. The medium for dissemination and access to the
collection is the Internet. The data is presented in the original
sound linked to transcripts and a series of annotation layers. The
electronic archive is being built as a bi-medial database
(sound/transcripts/indexes).
Sung-Hyuk Kim, Seoul, Korea
Sookmyung Women's University
Email: ksh@sookmyung.ac.kr
Web: http://lis.sookmyung.ac.kr/~ksh
Data Types: Metadata, Word list, Lexicon
Functions: Create
I have been involved in many digital library and text encoding
projects. My main area is document structuring using SGML and XML for
the Web. I am currently preparing a digital library project on Korean
Culture and Heritage with Tufts University. For this project, I am
going to develop a bilingual parallel corpus and a bilingual dictionary
for cross-language information retrieval between Korean and English.
I have also been involved in standardization activities for electronic
commerce, such as electronic documents and catalogs using XML.
Tom Klingler, New Orleans, Louisiana, USA
Tulane University
Email: klingler@tulane.edu
Web: http://www.tulane.edu/~klingler/
Data Types: Lexicon, Writing system, Field notes, Linguistic description
Functions: Store, Create, Display, Query
I have been engaged for the past ten years in the documentation and
description of Louisiana Creole, an endangered French-lexifier creole
spoken in south Louisiana. My recorded corpus consists of nearly 200
hours of audio tapes of variable quality (the best are in DAT
format), as well as approximately forty hours of videotapes that are
of very high quality. Most of the recordings consist of one-on-one
interviews with Creole speakers.
I am co-author (with Albert Valdman, Margaret M. Marshall, and Kevin
J. Rottet) of a dictionary of Louisiana Creole and am currently
completing a grammatical description of the variety spoken in Pointe
Coupee Parish. I am interested in exploring ways to enhance the
presentation and analysis of my data in digital formats.
Kai-Uwe Kuehnberger, Tuebingen, Germany
University of Tübingen, Germany (Theoretical Computational Linguistics)
Email: kaiuwe@sfs.nphil.uni-tuebingen.de
Data Types: Annotated signal, Field notes, Description
Functions: Store, Create
The Department of Theoretical Computational Linguistics (University
of Tübingen) is in the process of initiating a project developing
formal frameworks in order to model hypertexts and natural language
annotations. We think that the natural models for representing
structured texts and annotation graphs are certain classes of
coalgebras. From a theoretical perspective we will try to find
complexity-theoretic results concerning these classes of coalgebras. The long-term
perspective is to develop a formal semantics for hypertexts and natural
language annotations. Furthermore, we will exemplify these theoretical
results using applications.
Sadao Kurohasi, Kyoto, Japan
Kyoto University
Email: kuro@i.kyoto-u.ac.jp
Data Types: Metadata, Word list, Lexicon, Interlinear text
Functions: Store, Create, Convert
I have been developing several Japanese language resources, including
JUMAN (morphological analyzer), KNP (dependency analyzer), and Kyoto
University Corpus (40,000 sentences with syntactic tags).
Alan Lee, Philadelphia, Pennsylvania, USA
Linguistic Data Consortium
Email: aleewk@babel.ling.upenn.edu
Web: http://www.ldc.upenn.edu/exploration/expl2000
I am a graduate student in Linguistics at Penn, and am helping out with the
organization of this workshop.
Paola Monachesi, Utrecht, The Netherlands
Utrecht University
Email: Paola.Monachesi@let.uu.nl
Web: http://www-uilots.let.uu.nl/~Paola.Monachesi/personal
Data Types: Word list, Interlinear text, Paradigm, Linguistic description
Functions: Store, Create, Convert, Display, Query
I am coordinating a new project launched in the Netherlands which aims
at the creation of a `Typological Database System'. The project will
combine (and extend) existing typological databases with new ones to
be created. The ultimate goal is to develop a system that is able to
link separate databases in such a way that we can ask questions over
the whole set of databases. To this end, a meta-language will be
developed. The metadatabase is intended to be part of a `Language
Typology Resource Centre', a web-accessible electronic archive for
typological description, including powerful research tools such as
grammars, typological databases and language-typological expert systems.
Fenson Mwape, Takadacho, Japan
Osaka University of Foreign Studies / University of Zambia
Email: mwape1970@yahoo.com
Data Types: Word list, Field notes, Description
Functions: Store
I work on the minority languages of Zambia focusing on the Bemba
group of languages. My work mainly involves documenting these hitherto
unwritten languages.
Toshihide Nakayama, Upper Montclair, New Jersey, USA
Montclair State University
Email: nakayamat@alpha.montclair.edu
Data Types: Word list, Lexicon, Interlinear text, Field notes
Functions: Store, Create, Display, Query
I have been working on Nuu-chah-nulth (a.k.a. Nootka: Wakashan).
Over the years I have accumulated a fair amount of text and lexical
materials. I am very much interested in the idea of on-line grammar,
texts and lexicon, and I am exploring good ways to build online texts
(audio, video, as well as written materials) and hyperlinked grammar
and lexicon.
Robert Neumann, Mannheim, Germany
Foerderverein fuer Jiddische Sprache und Kultur, e.V., Duesseldorf
Email: robert.neumann@rhein-neckar.netsurf.de
Web: http://www.eydes.org
Data Types: Metadata, Word list, Lexicon, Annotated signal, Interlinear
text, Field notes
Functions: Store, Convert, Display, Query
We employ language engineering methods for preserving, exploiting and
disseminating the information inherent in the Language and Culture
Atlas of Ashkenazic Jewry. The collection consists of 6000 hours of
spoken language data primarily in the Yiddish language with parts in
at least a half dozen other languages, including English, Hungarian,
French, German, Hebrew, and Romanian. Furthermore, the contents of the
interviews refer to the knowledge of other cultural spheres making
this a particularly rich data resource, both from a linguistic and
cultural point of view. The medium for dissemination and access to
the collection is the internet. The data is presented in the
original sound, linked to transcripts and a series of annotation layers. The
electronic archive is being built as a bimedial database
(sound/transcripts/indexes).
Douglas Parks, Bloomington, Indiana, USA
Indiana University
Email: parksd@indiana.edu
Web: http://www.indiana.edu/~aisri
Data Types: Metadata, Lexicon, Annotated signal, Writing system,
Interlinear text, Paradigm, Field notes, Description
Functions: Store, Create, Convert, Display, Query
A linguist who is documenting two Caddoan and two Siouan languages and
preparing teaching materials for three of them. That work has
necessitated creation of software programs that support both
documentation (archiving) and dissemination in both printed and
web formats.
Eleanor Robson, Oxford, UK
Electronic Text Corpus of Sumerian Literature, University of Oxford
Email: eleanor.robson@all-souls.ox.ac.uk
Web: http://www-etcsl.orient.ox.ac.uk
Data Types: Metadata, Word list, Lexicon, Linguistic description, Common
Functions: Store, Convert, Display, Query
Sumerian is one of the world's oldest written languages, and is still
poorly understood. I am one of a small team who have been documenting,
editing, translating and publishing an SGML-XML corpus of Sumerian
literary texts from ca. 2000-1600 BC. We are now planning ways to
analyse that corpus in order to document and describe aspects of its
style, lexis, grammar and register.
Laurent Romary, Nancy, France
Laboratoire Loria
Email: Laurent.Romary@loria.fr
Data Types: Lexicon
Functions: Convert
Involved in TEI related projects since 1994. Contribution to the
reference annotation module in MATE. Project Leader of future ISO 16642
for Terminological Data Collection representation in XML.
Pat Ryckman, Charlotte, North Carolina, USA
UNC Charlotte
Email: plryckma@email.uncc.edu
Web: http://libweb.uncc.edu/archives
Data Types: Metadata, Annotated signal, Interlinear text, Common
Functions: Store, Create, Convert, Display, Query
We are developing a cross-disciplinary digital sound archive as a research and
teaching resource for the University and community.
We plan to digitize, transcribe and encode, using XML, a collection of oral
interviews and narratives for dissemination on the World
Wide Web.
Mike Sangrey, Landisburg, Pennsylvania, USA
The Bantu Initiative
Email: msangrey@pa.net
Data Types: Metadata, Word list, Lexicon, Field notes
Functions: Store, Create, Display, Query
The `Bantu Initiative', headed by Roger Van Otterloo, seeks to capture
the lexicon and grammar across the Bantu language landscape. It is
currently in the envisioning stage of development and my involvement
is currently being solidified; however, the intention is for me to
bring my software development expertise (some XML based), and some
linguistic skills to the project. I am also co-moderator of two Bible
translation lists.
Harold Schiffman, Philadelphia, Pennsylvania, USA
University of Pennsylvania
Email: haroldfs@ccat.sas.upenn.edu
Web: http://ccat.sas.upenn.edu/~haroldfs
Data Types: Lexicon
Functions: Create
Since 1984, I have been compiling an English Dictionary of the
Tamil Verb, with around 20,000 entries, each consisting of an English
verb, its Tamil equivalent (in spoken and in written Tamil), verb
class(es), example sentences (spoken and written), and synonyms. The next
stage is to add sound files for the spoken entries and publish it on
CD-ROM. Sample entries can be viewed at: http://ccat.sas.upenn.edu/plc/dictionary
Kiyoaki Shirai, Tokyo, Japan
Tokyo Institute of Technology
Email: kshirai@cl.cs.titech.ac.jp
Web: http://tanaka-www.cs.titech.ac.jp/%7Ekshirai/
Data Types: Lexicon, Interlinear text
Functions: Store, Create
In recent years, I have been involved in a research project to develop
text annotated with POS tags and word sense tags. For the
definition of word senses, we use the Iwanami Dictionary, a
published Japanese dictionary. Use of this corpus for Japanese SENSEVAL
is now under active consideration.
Andrew Simpson, Hayward, California, USA
CBOLD/University of California, Berkeley
Email: aksimpso@socrates.berkeley.edu
Data Types: Lexicon
Functions: Store, Convert
I am working with the Comparative Bantu Online Dictionary, working out issues
concerning the markup of dictionary materials for use in a database to be used, among
other things, for comparative linguistic work. Currently, I'm working on the markup of a
Ganda dictionary.
Stavros Skopeteas, Erfurt, Germany
University of Erfurt
Email: stavros.skopeteas@uni-erfurt.de
Data Types: Lexicon, Interlinear text, Paradigm, Field notes, Description
Functions: Store, Create, Query
Databases for lexical material, syntactic constructions, functional
descriptions, typological data etc.; elicitation of speech via
non-linguistic task games; situational conditioning of language
documentation
Debra Spitulnik, Atlanta, Georgia, USA
Emory University
Email: dspitul@learnlink.emory.edu
Web: http://www.emory.edu/COLLEGE/ANTHROPOLOGY/FACULTY/ANTDS/
Data Types: Metadata, Lexicon, Annotated signal, Interlinear text,
Paradigm, Field notes, Description
Functions: Store, Create, Convert, Display, Query
Bemba, a member of the Bantu language family, is the most widely
spoken language in Zambia (approx. 6 million speakers). I am developing
a linguistic database to archive and analyze naturally occurring
discourse collected in the field from a variety of speech genres (e.g.
oratory, song, conversation, newscasts). Research questions derive
mainly from sociolinguistics and linguistic anthropology, and include
questions about (1) Bemba language change in the context of high
multilingualism and high prestige English and (2) how oral traditions are
transformed in electronic media. Collaborators on the project are
working on other areas of grammar and lexicography.
Bemba Home page: http://www.emory.edu/COLLEGE/ANTHROPOLOGY/FACULTY/ANTDS/Bemba/
Digital Polyglot: http://www.emory.edu/COLLEGE/LINGUISTICS/POLYGLOT
Pirkko Suihkonen, Leipzig, Germany
Max Planck Institute for Evolutionary Anthropology
Email: suihkonen@eva.mpg.de
Web: http://www.ling.helsinki.fi/~suihkone/
Data Types: Metadata, Lexicon, Interlinear text, Common
Functions: Store, Create, Convert, Display
I have been working with documentation of electronic linguistic data,
especially with data located on the University of Helsinki Language
Corpus Server (http://www.ling.helsinki.fi/uhlcs). Right now, my
work at the MPI-EVA, Leipzig, concerns developing meta-descriptions to
be used as standards in multilingual and multimodal linguistic
studies.
Sheri Tatsch, Davis, California, USA
Second Language Acquisition Institute, UC Davis
Email: sjtatsch@ucdavis.edu
Web: http://slai.ucdavis.edu/about.htm
Data Types: Word list, Lexicon, Annotated signal, Writing system, Interlinear text, Paradigm,
Field notes
Functions: Store, Create, Display
SLAI's focus for our present project is the development of a single
web site with a comprehensive listing of materials and on-line
resources, listing of all institutions and centers of instruction, with
identification of fluent speakers willing to mentor the learner
working with self-instruction materials of the world's less commonly
taught languages. In addition we are purchasing learning aids for
these languages, now being deposited in the Language Learning Center
here at UC Davis. We are concerned with the lack of standards for data
formatting and the inability to locate with ease current information
on evolving research. As for myself, I am beginning work with the
Nisenan people of Northern California, who at this time have no fluent
speakers left. Documentation of what strands of language are left is
imperative. Data creation and archives would be a blessing for this
work and, of course, access to this data is fundamental. I have been
concerned with standardized form in my approach to documenting and
subsequently rebuilding everyday language use. Production of a
dictionary and grammar with a writing system for Nisenan-ne is a
goal.
Antonio Zampolli, Ghezzano, Italy
Istituto di Linguistica Computazionale - CNR
Email: eagles@ilc.pi.cnr.it
Web: http://www.ilc.pi.cnr.it
Data Types: Metadata, Lexicon, Interlinear text, Description
Functions: Store, Create, Convert, Display, Query
European Coordinator of the ISLE/EAGLES project.
Previous responsibility in standardization of text corpora and lexicons.
Design and creation of large computational lexicons with morphological, syntactic and semantic
information. Annotation of text corpora at various levels of linguistic description.