A Regular Query for Context-Sensitive Relations Uwe Monnich, Frank Morawietz and Stephan Kepser Seminar fur Sprachwissenschaft Universitat Tubingen The present paper tries to exploit the recent convergence of studies in the fields of database technology and computational linguistics. Complicating the task of integrating data types from diverse sources of linguistic documentation is the fact that natural languages exhibit a property that makes it impossible for them to be accommodated within the limits of core XML, the standard format for data exchange on the Web. As has been noticed since the beginning of formal language theory certain grammatical phenomena like morphological congruences (e.g., Bambara) and cross-serial dependencies between case markings (Swiss German) are outside the realm of context-free languages and need for their decriptive analysis a (limited) amount of contextual information. Due to this character of context-sensitivity that comes with a range of grammatical constructions, even monadic second-order logic (MSO), considered as a powerful query language, is too weak to capture these phenomena. Taking our inspiration from universal algebra we regard grammatical categories as basic constants of a many-sorted algebra with a distinguished set of composition and projection symbols. Through the explicit introduction of these operation symbols it becomes possible to turn the data models of contemporary syntactic theories into a kind of labeled tree structures that can either be generated by regular tree grammars or are indentifiable with collections of finite trees specifiable by formulae of MSO logic. In the particular case of the verbal complex of Swiss German mentioned above it is easy to describe in MSO terms the two verbal and nominal clusters, respectively. What is problematic from the point of view of regularity is the set of fixed syntactic and semantic relations between the verbal elements and their case-marked arguments. In other words, an MSO specification of these bipartite structures would return - regarding the MSO specification as a yes/no query - structures that do not satisfy the particular set of cross-serial dependencies characteristic of the instance of context-sensitivity under discussion. Despite this lack of expressive power of the chosen query language the algebraic approach adumbrated above is remarkably effective in filtering out the syntactic "noise" from the query result. Since the explicit algebraic structures are elements of a regular family of trees it is again easy to produce an MSO formula that characterizes exactly these cross-serial dependencies among the explicit structures that were out of the reach of the query language, on the intended linguistic level. It takes then only linear time to project those explicit structures that comply with the set of syntactic and semantic relations onto the data model where the original query was formulated. It turns out that this method of regularizing queries of context-sensitive structures can be adopted to all grammatical phenomena that fall within the reach of current linguistic theories (Kolb et al.; 2000a,b; Michaelis et al.; 2000). References Kolb, H.-P., Michaelis, J., Monnich, U. and Morawietz, F. (2000a). An op- erational and denotational approach to non-context-freeness. To appear in: Theoretical Computer Science, Elesevier Science. Draft available under http://tcl.sfs.uni-tuebingen.de/~frank/. Kolb, H.-P., Monnich, U. and Morawietz, F. (2000b). Descriptions of cross-serial dependencies, Grammars 3(2/3): 189{216. Michaelis, J., Monnich, U. and Morawietz, F. (2000). On minimalist attribute grammars and macro tree transducers. To appear in: Linguistic Form and its Computation, Ch. Rohrer, A. Rossdeutscher, H. Kamp (eds), CSLI. Draft available under http://tcl.sfs.uni-tuebingen.de/~frank/.