The transcription system EXMARaLDA: an application of the annotation graph formalism as the basis of a database of multilingual spoken discourse

Thomas Schmidt
University of Hamburg

The SFB "Mehrsprachigkeit" (Research Center on Multilingualism) at the University of Hamburg brings together linguists doing research on multilingualism from a variety of theoretical perspectives. The majority of the projects work with spoken language data, and the formats, tools and platforms these data come in are as diverse as the backgrounds of the research groups: there are data in thirteen different languages (German, English, French, Italian, Spanish, Portuguese, Basque, Luganda, Turkish, Japanese, Norwegian, Swedish, Danish), created and processed on different platforms (Windows, Macintosh, Linux) and with different transcription conventions (Verbmobil, HIAT and others) and transcription tools (syncWriter, HIAT-DOS, LAPSUS (dBase), conventional text editors etc.). The systems in use are for the most part outdated and technically as well as conceptually incompatible with one another. The need for an exchange format between them is therefore urgent.

The project "Database Multilingualism" aims not only to provide such an exchange format but also to implement appropriate tools for input and output of the data for different purposes and on different platforms. Its main activity until now has been the development of the discourse transcription system EXMARaLDA (EXtensible MARkup Language for Discourse Annotation). Based on the annotation graph formalism, EXMARaLDA defines an XML encoding of discourse transcriptions that is independent of both linguistic theory and concrete graphical representation and that can encompass, and mediate between, the various formats in use at the SFB.
At the same time, EXMARaLDA provides mechanisms and tools for user-friendly input and output of the data that reflect the needs of the individual projects. In particular, tools for input and output of discourse transcriptions in musical score notation (German: Partiturformat) have been designed and implemented. As a next step, all existing data at the SFB will be converted into the EXMARaLDA format. The result of this conversion process will be a number of XML-coded corpora that can be easily exchanged between researchers and adapted to their respective needs. The ultimate goal of the project "Database Multilingualism" is the implementation of a database that brings together all the multilingual data at the SFB and makes them available for integrated search and analysis.
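
To make the idea of an annotation-graph-based transcription more concrete, the following is a minimal sketch in Python. It is an illustration under stated assumptions, not the EXMARaLDA schema or its tools: the class `Event`, the functions `to_xml` and `to_score`, and the XML element names (`transcription`, `timeline`, `point`, `tier`, `event`) are all hypothetical. It shows labelled events anchored to points on a shared timeline, serialized once as XML and once as a simple score-like layout in which simultaneous speech is aligned vertically.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """A labelled arc between two points of a shared timeline (hypothetical model)."""
    tier: str   # e.g. one tier per speaker
    start: int  # index of the timeline point where the event begins
    end: int    # index of the timeline point where the event ends
    text: str


def to_xml(n_points, events):
    """Serialize timeline points and events to XML (element names are illustrative)."""
    root = ET.Element("transcription")
    timeline = ET.SubElement(root, "timeline")
    for i in range(n_points):
        ET.SubElement(timeline, "point", id=f"T{i}")
    tiers = {}
    for ev in events:
        if ev.tier not in tiers:
            tiers[ev.tier] = ET.SubElement(root, "tier", id=ev.tier)
        node = ET.SubElement(tiers[ev.tier], "event",
                             start=f"T{ev.start}", end=f"T{ev.end}")
        node.text = ev.text
    return ET.tostring(root, encoding="unicode")


def to_score(n_points, events):
    """Render a score-like view: one row per tier, one column per timeline interval."""
    n_cols = n_points - 1
    tier_names = list(dict.fromkeys(ev.tier for ev in events))
    grid = {t: [""] * n_cols for t in tier_names}
    for ev in events:
        # Simplification: an event is written into its first interval only.
        grid[ev.tier][ev.start] = ev.text
    widths = [max(len(grid[t][c]) for t in tier_names) for c in range(n_cols)]
    label_w = max(len(t) for t in tier_names)
    lines = []
    for t in tier_names:
        cells = [grid[t][c].ljust(widths[c]) for c in range(n_cols)]
        lines.append((t.ljust(label_w) + " | " + " | ".join(cells)).rstrip())
    return "\n".join(lines)


# Speaker B starts talking while speaker A is still finishing the greeting.
events = [
    Event("A", 0, 1, "Guten"),
    Event("A", 1, 2, "Morgen."),
    Event("B", 1, 2, "Hallo!"),
]
print(to_xml(3, events))
print(to_score(3, events))
```

The same event list drives both outputs, which is the point of separating the data model from its graphical representation: the score view is only one possible rendering of the underlying graph.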