Hierarchical data and the derivational relationship between words

Andrew Hippisley, Mariam Tariq
University of Surrey

The experimental Node DataBlade (Brown 2001) is a software bundle that extends the object-relational database system Informix to support the storage and manipulation of hierarchical data. We show how this functionality provides a way of capturing inheritance relationships between linguistic data. One area where inheritance plays a key role in the description of a set of linguistic expressions is derivational morphology. The derivational family of shkola 'school' can be naturally represented as a hierarchy, shown here in part as the (partial) Shkola hierarchy:

1. Shkola
   1.1 shkol'nik 'pupil'
       1.1.1 shkol'nicheskij 'schoolboyish'
             1.1.1.1 shkol'nichesko 'in a schoolboyish manner'
   1.2 shkol'nyj 'school' (adj)
   1.3 shkolit' 'train'

The Node DataBlade stores node identifiers using the Dewey Decimal Scheme and allows searches through the branches of the hierarchy using standard SQL. In our example the status of members of a derivational family can be queried. We can write queries to identify the relationship between members of a derivational family, and queries to check the productivity of a given family, i.e. how many words it derives. For example, we can query all the siblings of the Shkol'nik node to elicit all the co-derivatives of shkol'nik, namely shkol'nyj and shkolit', and all the children of the same node to generate its derivative shkol'nicheskij.

We can also query a hierarchy for the number of nodes it has, giving a measure of its lexical productivity. The table compares the Shkola hierarchy with a separate Slovo 'word' hierarchy; the 'Total' row gives the total number of family members. The table also shows how the members are distributed across levels. For Shkola, 9 of the 15 members (60%) occur at level 3, i.e. are derivatives of derivatives of the root word.

No. of items at given level

Depth      Slovo   Shkola
Level 1      1       1
Level 2      6       4
Level 3      5       9
Level 4      1       1
Total       13      15

The hierarchical functionality of the Node DataBlade provides an elegant way of capturing morphological relationships between words by encoding derivational relations as hierarchical relations. To populate such a database we could use the output of word formation rules, for example the computable model proposed in Hippisley (2001). Among other things, questions about lexical productivity and frequency effects (Schreuder and Baayen 1997) could be addressed, and etymological information could be elicited: older words are more likely to have larger derivational families (Dixon 1982).
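
The sibling, child and node-count queries described above can be illustrated outside the database. The following is a minimal sketch in plain Python, not the Node DataBlade interface or its SQL: it assumes only that each word is keyed by a Dewey-style identifier, as in the partial Shkola hierarchy. The function names (parent, depth, children, siblings, level_counts) are hypothetical and chosen for illustration. Because only the six nodes of the partial hierarchy are included, the counts it yields are not those of the full 15-member family in the table.

```python
from collections import Counter

# Partial Shkola hierarchy from the text, keyed by Dewey-style node ids.
SHKOLA = {
    "1": "shkola",
    "1.1": "shkol'nik",
    "1.1.1": "shkol'nicheskij",
    "1.1.1.1": "shkol'nichesko",
    "1.2": "shkol'nyj",
    "1.3": "shkolit'",
}


def sort_key(node_id):
    """Order ids numerically, so that '1.10' sorts after '1.9'."""
    return tuple(int(part) for part in node_id.split("."))


def parent(node_id):
    """Return the parent id, or None for the root."""
    head, sep, _ = node_id.rpartition(".")
    return head if sep else None


def depth(node_id):
    """Level of a node: the root is level 1, its derivatives level 2, etc."""
    return node_id.count(".") + 1


def children(tree, node_id):
    """Words derived directly from the word at node_id."""
    ids = sorted((n for n in tree if parent(n) == node_id), key=sort_key)
    return [tree[n] for n in ids]


def siblings(tree, node_id):
    """Co-derivatives: other words sharing the same parent as node_id."""
    p = parent(node_id)
    ids = sorted(
        (n for n in tree if n != node_id and parent(n) == p), key=sort_key
    )
    return [tree[n] for n in ids]


def level_counts(tree):
    """Number of family members at each level of the hierarchy."""
    return dict(Counter(depth(n) for n in tree))


if __name__ == "__main__":
    print(siblings(SHKOLA, "1.1"))   # co-derivatives of shkol'nik
    print(children(SHKOLA, "1.1"))   # derivatives of shkol'nik
    print(level_counts(SHKOLA), len(SHKOLA))
```

Deriving the parent and the depth from the identifier itself is what makes the Dewey scheme convenient here: no separate parent column or recursive traversal is needed for these three queries.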