Proceedings of the IRCS Workshop on Linguistic Databases

11-13 December 2001
University of Pennsylvania, Philadelphia, USA

Organized by Steven Bird, Peter Buneman and Mark Liberman
Funded by the National Science Foundation


I. Aldezabal, O. Ansa, B. Arrieta, X. Artola, A. Ezeiza, G. Hernández, M. Lersundi
EDBL: a general lexical basis for the automatic processing of Basque [pdf] p.1-10
Anthony Aristar, Helen Aristar-Dry
The E-MELD Project [pdf] p.11-16
X. Artola, A. Soroa
An Architecture for a Federation of Highly Heterogeneous Lexical Information Sources [pdf] p.17-23
Masayuki Asahara, Ryuichi Yoneda, Yuji Matsumoto
Use of a Relational Database in the Development and Maintenance of Linguistic Resources for Statistical Japanese Morphological Analysis [pdf] p.24-31
Sonya Bird, Melody Jeffcoat, Michael Hammond
Electronic Dictionaries for Languages of the Southwest [pdf] p.32-37
Heather Bliss, Elizabeth Ritter
Developing a Database of Personal and Demonstrative Pronoun Paradigms: Conceptual and Technical Challenges [pdf] p.38-47
Daan Broeder, Freddy Offenga, Don Willems, Peter Wittenburg
The IMDI Metadata set, its Tools and accessible Linguistic Databases [pdf] p.48-55
Dunstan Brown
Constructing a typological database for inflectional morphology: the SMG database for syncretism [pdf] p.56-64
Hennie Brugman, Peter Wittenburg
The application of annotation models for the construction of databases and tools: Overview and analysis of MPI work since 1994 [pdf] p.65-73
Angelo Dalli
Interoperable Extensible Linguistic Databases [pdf] p.74-81
Jeff Good, Ronald Sprouse
Creating a database and query-tools for the TELL multi-speaker linguistic corpus [pdf] p.82-91
Jorge Gurlekian, Laura Colantoni, Humberto Torres, Hernan Rodríguez, Antonio Rincón, Asunción Moreno, José Mariño
Database for Automatic Speech Recognition System for Argentine Spanish [pdf] p.92-98
Jorge Gurlekian, Hernan Rodríguez, Laura Colantoni, Humberto Torres
Development of a Prosodic Database for an Argentine Spanish Text to Speech System [pdf] p.99-104
Jan Hajic, Barbora Hladka, Petr Pajas
The Prague Dependency Treebank: Annotation Structure and Support p.105-114
Larry Hayashi, John Hatton
Combining UML, XML and relational database technologies - the best of all worlds for robust linguistic databases [pdf] p.115-124
Andrew Hippisley, Mariam Tariq, David Cheng
Hierarchical data and the derivational relationship between words [msword] p.125-133
Martin Holub, Pavel Mika
MATES - an experimental linguistic database system [pdf] p.134-140
Nancy Ide, Laurent Romary
Standards for Language Resources [pdf] p.141-149
Heidi Johnson
The Archive of Indigenous Languages of Latin America [msword] p.273-283
William Kretzschmar
Linguistic Databases of the American Linguistic Atlas Project (ALAP) [pdf] p.157-166
William Lewis, Scott Farrar, D. Terence Langendoen
Building a Knowledge Base of Morphosyntactic Terminology [pdf] p.150-156
Christopher Manning, Kristen Parton
What's needed for lexical databases? Experiences with Kirrkirr [pdf] p.167-173
Jan-Torsten Milde, Ulrike Gut
The TASX-environment: an XML-based corpus database for time aligned language data [pdf] p.174-180
Paola Monachesi, Alexis Dimitriadis, Rob Goedemans, Anne-Marie Mineur, and Manuela Pinto
The Typological Database System [pdf] p.181-186
Uwe Monnich, Frank Morawietz, Stephan Kepser
A Regular Query for Context-Sensitive Relations [pdf] p.187-195
Simon Musgrave
A Brief Description of the Spinoza Typological Database [pdf] p.196-199
Brad Penoff, Chris Brew
TREX-Q: A query language based on XML Schema [pdf] p.200-209
Brian Roark
Storing automatically generated treebanks in lattices of derivations [ps] p.210-218
Thomas Schmidt
The transcription system EXMARaLDA: an application of the annotation graph formalism as the basis of a database of multilingual spoken discourse [pdf] p.219-227
Elke Teich, Silvia Hansen, Peter Fankhauser
Representing and querying multi-layer corpora [pdf] p.228-237
John Thomson
Representing Multilingual and Annotated Text in Memory and in a Relational Database [msword] p.263-272
Thorsten Trippel, Dafydd Gibbon
PAX - an annotation based concordancing toolkit [pdf] p.238-244
R.J.J.H. van Son, Louis C.W. Pols
Structure and access of the open source IFA-corpus [pdf] p.245-253
Martin Wynne
Writing a Corpus Cookbook [msword] p.254-262

Steven Bird, Peter Buneman, & Mark Liberman (LDC, CIS, & Linguistics)
Email: sb@ldc.upenn.edu, peter@cis.upenn.edu, myl@unagi.cis.upenn.edu