TREX-Q: A query language based on XML Schema

Brad Penoff, CIS, The Ohio State University, and Sun Microsystems Ltd.
Chris Brew, Linguistics, The Ohio State University

James Clark's TREX (http://www.thaiopensource.com) is a clean, simple and powerful schema language for XML that is designed to complement a datatyping language such as that described in XML Schema Part 2: Datatypes (http://www.w3.org/TR/xmlschema-2/). We have designed and partially implemented a query language, called TREX-Q, that is based on TREX but borrows some of the features of McKelvie's XMLQUERY (http://www.ltg.ed.ac.uk/~dmck/xmlstuff/xmlquery/index.html). TREX has spawned a closely related successor, RELAX-NG (http://www.oasis-open.org/committees/relax-ng/). In this abstract we sometimes use the term TREX to stand for both RELAX-NG and TREX, though our actual implementation work was based on TREX proper.

TREX is implemented, but the provided application is just a validator, which returns a single bit of information: whether or not the document conforms to the schema. For corpus exploration mere validation is insufficient: we also require the ability to return structured results. In other words, we require a query language.

We took several ideas from XMLQUERY. McKelvie's contribution is threefold. Firstly, he provides a concise and usable syntax for queries. Secondly, XMLQUERY is the distillation of several years' experience of implementing the things that linguists and dialogue annotators who have been exposed to XML ask for next. Finally, McKelvie specifies a non-deterministic virtual machine for corpus query, which shows how to reconcile the language's power with efficient execution over corpora that are too large to hold in memory and for which streamed access is the only practical option.

Our implementation strategy has been to use the code for James Clark's TREX validator (http://www.thaiopensource.com/trex/jtrex.html) as the basis for extensions in the direction of XMLQUERY.
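
The contrast between validation and query, and the role of streamed access, can be illustrated with a minimal Python sketch. It is neither TREX nor TREX-Q: the corpus, the element names (corpus, s, w), the pos attribute and the hard-coded "pattern" are all hypothetical, chosen only to show a check that yields one bit next to one that yields structured results while reading the input as a stream.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical toy corpus; element and attribute names are illustrative only.
CORPUS = b"""<corpus>
  <s id="s1"><w pos="DT">the</w><w pos="NN">dog</w></s>
  <s id="s2"><w pos="VB">run</w></s>
  <s id="s3"><w pos="DT">a</w><w pos="NN">cat</w><w pos="VB">sleeps</w></s>
</corpus>"""


def sentence_matches(sentence):
    """Toy stand-in for a pattern: the sentence contains a <w pos="NN">."""
    return any(w.get("pos") == "NN" for w in sentence.iter("w"))


def stream_sentences(source):
    """Yield each <s> element once its end tag has been parsed."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "s":
            yield elem
            # Discard the sentence's children and text once the consumer
            # is done with it, so its content is not retained.
            elem.clear()


def validate(source):
    """Validator-style answer: a single bit for the whole document."""
    return all(sentence_matches(s) for s in stream_sentences(source))


def query(source):
    """Query-style answer: one structured record per matching sentence."""
    results = []
    for s in stream_sentences(source):
        if sentence_matches(s):
            nouns = [w.text for w in s.iter("w") if w.get("pos") == "NN"]
            results.append({"sentence": s.get("id"), "nouns": nouns})
    return results


if __name__ == "__main__":
    print(validate(io.BytesIO(CORPUS)))
    print(query(io.BytesIO(CORPUS)))
```

On this toy corpus validate reports only that the document as a whole fails the check, whereas query identifies the sentences s1 and s3 and the nouns found in each. A real TREX-Q query would express the pattern in the schema itself rather than in code.
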
The input to our program is a standard TREX schema augmented with special elements that control what is effectively a TREX-Q interpreter. Other approaches are possible: for example, Okajima's RELAX-NGCC (http://homepage2.nifty.com/okajima/relaxngcc/index_en.htm) is a Yacc-like compiler from annotated RELAX-NG to efficient standalone Java source code.

TREX is far from concise, so we do not aspire to reproduce the syntactic neatness of XMLQUERY in TREX-Q. In actual use we anticipate that many users will generate queries either through a graphical "query by example" interface or through a more concise user-level syntax.

At the workshop we will present the thinking behind our design and demonstrate the current state of TREX-Q. Our hope is that TREX-Q combines the elegance of McKelvie's query language design with the clarity and precision of the TREX formalism.