What's in a Word?
The Why's and What For's of a Nahuatl Dictionary:
The Nahuatl Learning Environment Project

Jonathan D. Amith, Yale University

1460 James Howe Rd. Dallas, OR 97338, USA
jonathan.amith@yale.edu | www.yale.edu/nahuatl


The Nahuatl Learning Environment represents an effort to link lexicon, grammar, and corpus into a multifaceted and multimedia system for research and learning (see project overview [postscript|pdf]). It aims to combine two research projects (a dictionary and a reference/pedagogical grammar) while making available to students and scholars the primary field data that were used in the elaboration of each. The project is based on three fundamental premises. The first is that for both to be fully functional, grammars and dictionaries must be used in conjunction, the first providing the grammatical context for using dictionary material while the second furnishes the appropriate lexical base to operationalize a learned grammar. The second is that any lexicographic project should make an effort to meet the needs of the greatest number of potential users. With Nahuatl this is a particularly challenging problem since this language is of interest to highly disparate groups: linguists, historians, anthropologists, theater and dance troupes, native and heritage language speakers, and simply the curious. The final premise is that given the ongoing nature of linguistic research and the changing paradigms of analysis, as well as the impossibility for one researcher to concisely and accurately analyze a given language, as much primary data as possible should be made available to interested scholars and students. Electronic presentation and manipulation of the data (corpus) and analysis (lexicon and grammar) through the internet is a key element in fulfilling the project's goals.

The Nahuatl Learning Environment originated in an effort to elaborate a trilingual dictionary (Nahuatl to Spanish and English) based on approximately four and a half years of ethnographic fieldwork in two neighboring Nahuatl-speaking communities located near the Balsas River in central Guerrero, Mexico (see map): Ameyaltepec (3 years) and San Agustín Oapan (1 1/2 years). The initial corpus comprised approximately 20,000 filecards (containing headwords and accompanying phrases) of unelicited utterances from Ameyaltepec; another 2,000 cards were later elaborated in Oapan. Approximately 100 hours of recordings from various villages will eventually yield interlinearized texts that will be incorporated into the electronic database as primary source material. The present schedule targets December 2000 for the completion of lexical data entry (an estimated 10,000 headwords) into an electronic database format. The following year I will complete the first draft of a reference/pedagogical grammar and transcribe many of the recorded tapes. Given that original fieldwork was oriented not to linguistic goals but to ethnographic research, additional fieldwork with native speakers in Ameyaltepec and Oapan will be necessary to enhance the corpus, achieve more precise definitions for the lexicon, and refine the grammatical analysis.

In the lexicon being developed, entries have been organized according to various fields, within which a coding system has been used to facilitate the retrieval of certain categorical information (e.g., intransitive verbs of particular morphological shapes, kinship terms, incorporated nouns, causatives). Eventually a parsed primary corpus will further facilitate searches and retrieval of information. An extensive cross-referencing system has also been developed and put in place to assist users in tasks such as following derivational processes and locating stems. For example, verbs are coded according to valency and morphology. Thus, koto:ni `to snap' is an intransitive verb that manifests "nondirected alternation" (the term is taken from Haspelmath) and that ends in -ni (see example verb entry [postscript|pdf]; the underlying code used for koto:ni is V-1-nondir-ni/na). The database format used permits the easy retrieval of all verbs with a similar morphological structure and syntactic function. Links can also be easily established to the interlinearized corpus (which, for example, may be searched for a concordance of koto:ni, or any other lexeme, stem, category, tense or aspect, etc.). Likewise, when using the reference/pedagogical grammar, at a point where such verbs (in this case, those manifesting nondirected alternation) are discussed, a hypertext link could be embedded that would generate all verbs in this category. Thus, those studying a particular form may instantly access those words that exemplify it (see the work-through exercise described below in relation to the verb poliwi).
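
The retrieval by code described above can be illustrated with a minimal sketch. It is not the project's actual database format: the `Entry` record, the function names, and the reading of the code V-1-nondir-ni/na as category, valency, alternation type, and ending pattern are all assumptions made for illustration. Only koto:ni and its code come from the text; the other two entries are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One lexicon entry (hypothetical record layout)."""
    headword: str
    gloss: str
    code: str  # categorical code, e.g. "V-1-nondir-ni/na"


def parse_verb_code(code: str) -> dict:
    """Split a verb code into its parts.

    Assumed reading (not confirmed by the source):
    category - valency - alternation type - ending pattern.
    """
    category, valency, alternation, ending = code.split("-", 3)
    return {
        "category": category,
        "valency": int(valency),
        "alternation": alternation,
        "ending": ending,
    }


def find_by_code(lexicon, code):
    """Return every entry whose code matches exactly."""
    return [entry for entry in lexicon if entry.code == code]


LEXICON = [
    Entry("koto:ni", "to snap", "V-1-nondir-ni/na"),  # from the text
    Entry("PLACEHOLDER-VERB", "placeholder gloss", "V-2-placeholder-x"),  # hypothetical
    Entry("PLACEHOLDER-NOUN", "placeholder gloss", "N-placeholder"),  # hypothetical
]

matches = find_by_code(LEXICON, "V-1-nondir-ni/na")
print([entry.headword for entry in matches])
print(parse_verb_code("V-1-nondir-ni/na"))
```

A hypertext link embedded in the grammar could, under these assumptions, simply carry the code string and display whatever `find_by_code` returns.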

Additional examples of the project structure can be gleaned from the schematic representation of the noun kuhtekomatl (see example noun entry [postscript|pdf]). Links will be established to an encyclopedic text that will include extensive documentation of an ethnographic and linguistic nature: plant uses, a trilingual visual dictionary, onomasiological studies that explore shades of meaning difference among near synonyms, and sound files with texts of particular interest (e.g., ritual speech), among many other items. For compounds such as kuhtekomatl, users will be able to generate searches producing a root dictionary (see the work-through example with the root a: `water' below).

This discussion has suggested how various elements of a wide-ranging field project can be linked in a manner that enhances the overall research and pedagogical value of all facets of exploration: 1) from the corpus, occurrences of words can be activated to link to the dictionary entry; 2) from the lexicon, a concordance can be generated in the corpus; 3) within the lexicon, lists and tables of words of similar form and function can be generated, or links can be established to the relevant sections of the grammar; 4) from the grammar, illustrative material can be extracted from the lexicon or corpus; and 5) from the lexicon, links may be established to a more encyclopedic approach to cultural and linguistic data in need of greater elaboration and exegesis. These five points suggest a final and basic element of the nature of the Nahuatl Learning Environment: to use a linked lexical and grammatical study to provide the foundation for presenting and articulating anthropological and linguistic studies in the widest sense of these terms.

Postscript:
Work-through examples of links between grammar and lexicon
from a prototype of the Nahuatl Learning Environment project

To see how grammar and lexicon can be linked, follow these steps:

  1. Go to www.yale.edu/nahuatl
  2. Activate the link at the bottom of the page: Lessons & Exercises
  3. Enter chapter 3, "Intransitivity and Nahuatl Verbs"
  4. Scroll down to table 3.a, Intransitive and Transitive Pairs of Verbs
  5. Activate the link on poliwi

This will generate a table of all verbs, 22 at present, that have intransitive/transitive forms distinguished by the endings -iwi and -owa, respectively. Scroll to the bottom of the page to see the query submitted via Hyperlex to the Nahuatl lexicon database. The display may be changed. For example, change to Español complete and resubmit the search. This will generate a table of the same verbs in Spanish, though now all fields of the Spanish database (including illustrative sentences) will be displayed.
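
The kind of query behind this table can be sketched as follows. This is not the Hyperlex query itself, whose syntax is not shown here; the function names are hypothetical, and the transitive form polowa is inferred from the -iwi/-owa pattern stated above rather than quoted from the lexicon.

```python
def transitive_counterpart(intransitive: str) -> str:
    """Swap the intransitive ending -iwi for the transitive ending -owa."""
    if not intransitive.endswith("iwi"):
        raise ValueError(f"{intransitive!r} does not end in -iwi")
    return intransitive[: -len("iwi")] + "owa"


def find_iwi_owa_pairs(headwords):
    """Return (intransitive, transitive) pairs where both forms are attested."""
    attested = set(headwords)
    pairs = [
        (word, transitive_counterpart(word))
        for word in attested
        if word.endswith("iwi") and transitive_counterpart(word) in attested
    ]
    return sorted(pairs)


# Illustrative headword list: poliwi and koto:ni appear in the text;
# polowa is assumed from the stated pattern.
HEADWORDS = ["poliwi", "polowa", "koto:ni"]

print(find_iwi_owa_pairs(HEADWORDS))
```

Requiring both members of a pair to be attested keeps the query from inventing forms the lexicon does not contain.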

To see how learning can be facilitated in the Nahuatl Learning Environment, follow these steps:

  1. Go to www.yale.edu/nahuatl
  2. Activate the link at the bottom of the page: Lessons and Exercises
  3. Enter exercise 2, for the lesson on Phonology, Orthography, Accent, and Syllable Structure
  4. Scroll down in the middle frame until #6, chikwëi, is visible
  5. In the second column (first blank box), for Classical orthography, write chicuei; in the third column (second blank box), for Jesuit orthography, also write chicuei. Leave the final column, for meaning, blank, since you don't know the translation. Click the red button on the side to submit.
  6. You should generate an angel for the correct answer in the first blank box; death for the wrong answer in the second blank box; and death for the blank left in the final column.
  7. Jesuit orthography records vowel length, so correct the Jesuit entry to chicue:i (for now the sequence V: is used to indicate length). Resubmit with the red button.
  8. Now, to find out the meaning, click on the word itself (leftmost column). This will search the lexicon for the corresponding headword entry.
  9. Choose `eight' as the meaning and type into the rightmost column. Resubmit.
  10. The answers should be correct.
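
The feedback loop in these steps amounts to comparing each submitted field against a single stored answer. A minimal sketch follows; the field names and the `grade` function are hypothetical, and only the answers for chikwëi are taken from the steps above.

```python
# Answer key for item #6, chikwëi, as given in the walk-through.
ANSWER_KEY = {
    "classical": "chicuei",
    "jesuit": "chicue:i",
    "meaning": "eight",
}


def grade(submission: dict, key: dict = ANSWER_KEY) -> dict:
    """Mark each field 'angel' if it matches the key exactly, else 'death'.

    A missing or blank field counts as wrong.
    """
    return {
        field: "angel" if submission.get(field, "").strip() == expected else "death"
        for field, expected in key.items()
    }


first_try = {"classical": "chicuei", "jesuit": "chicuei", "meaning": ""}
final_try = {"classical": "chicuei", "jesuit": "chicue:i", "meaning": "eight"}

print(grade(first_try))
print(grade(final_try))
```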

Note that this format is particularly useful for teaching orthography and in linking vocabulary used in lessons and exercises to the lexicon. It is also useful in teaching parsing. For example, go to the page with the lessons and exercises and select exercise 3, for the lesson on intransitivity and Nahuatl verbs. In exercise 3, scroll to the middle set of problems, the table that begins with `to cry.'

  1. In problem #5, `to be in a hurry,' parse the verb tisiwin as t(i)-siwi-n and type in the meaning `we are in a hurry.' Submit.
  2. The answer is wrong. This is because the verb is /i/-initial. Retype t(i)-isiwi-n and resubmit.
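
One way to see why the first parse fails is to rebuild the surface word from the parse. The sketch below assumes, as a convention not stated in the text, that parenthesized material marks a segment that is elided on the surface and that hyphens mark morpheme boundaries; the function names are hypothetical and this is not how the prototype is known to check answers.

```python
import re


def surface_form(parse: str) -> str:
    """Rebuild the surface word from a hyphenated parse.

    Assumption: segments in parentheses are elided on the surface.
    """
    without_elided = re.sub(r"\([^)]*\)", "", parse)
    return without_elided.replace("-", "")


def parse_matches_word(parse: str, word: str) -> bool:
    """True if the parse reconstructs exactly the given surface word."""
    return surface_form(parse) == word


print(parse_matches_word("t(i)-siwi-n", "tisiwin"))
print(parse_matches_word("t(i)-isiwi-n", "tisiwin"))
```

Under this reading, the parse with the stem siwi leaves no /i/ for the surface form, while the parse with the /i/-initial stem isiwi reconstructs tisiwin.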

Exercises like this can be useful for learning both grammar and vocabulary. They are particularly easy to design for queries that have one unique and correct answer. For problems such as the parsing and translation of clauses or phrases, there is greater difficulty in providing a means to indicate correct answers.

To see how the Nahuatl lexicon can generate and display data, go back to the Nahuatl home page at www.yale.edu/nahuatl and access Dictionary at the bottom of the page.

This will take you to a tutorial for Steven Bird's Hyperlex search engine, which he has adapted to the Nahuatl lexicon. The aim of the tutorial is to teach the basics of this program to potential users of the lexicon. However, certain queries have been embedded in the text in order to demonstrate the power of this search engine when combined with a lexicon organized according to a database format.

  1. In the tutorial, search for the number "174." This should find a phrase that reads "This search should generate 174 hits."
  2. Activate the link. This will generate a table of all occurrences of the stem /a:/ `water' in the words that have been entered to date.
  3. The left-hand column lists the stems and morphemes, the right-hand column the words and meanings. Again, as always, the user can change the display by altering the parameters set in the query box (found at the end of the display of 174 words).
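
A root dictionary of this kind can be thought of as an index from each stem to the words that contain it. The following sketch assumes each entry records its stems in a list; the data layout and function name are hypothetical, and the word and stem labels other than a: are placeholders rather than real lexicon data.

```python
from collections import defaultdict


def build_root_index(entries):
    """Map each stem to the words containing it, in entry order."""
    index = defaultdict(list)
    for word, stems in entries:
        # dict.fromkeys drops repeated stems while keeping their order
        for stem in dict.fromkeys(stems):
            index[stem].append(word)
    return dict(index)


# Placeholder entries: (word, stems found in the word).
ENTRIES = [
    ("WORD-1", ["a:"]),
    ("WORD-2", ["a:", "STEM-X"]),
    ("WORD-3", ["STEM-X"]),
]

root_index = build_root_index(ENTRIES)
print(root_index["a:"])
```

With the real lexicon, looking up a: in such an index would correspond to the table of 174 words described above.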


Linguistic Exploration Workshop