Multidimensional Exploration of Online Linguistic Field Data

Steven Bird, LDC, University of Pennsylvania

This page is intended to be used in conjunction with the hardcopy version of the paper, which appeared in the proceedings of NELS-29 (GLSA). Section, example and figure numbers are all keyed to the paper. Click on the examples to listen to them, and use the HyperLex and Paradigm links to query the online speech corpora. In many cases, the laryngograph data can be heard by clicking on the tone transcription. The more complex queries are recommended only over a high-bandwidth internet connection.

Abstract

Advances in storage technology make it possible to house virtually unlimited quantities of recorded speech data online. Advances in character-encoding technology make it possible to create platform-independent transcriptions. Advances in Web technology make it possible to publish this data for essentially no marginal cost. These developments have profound consequences for the accessibility, quality and quantity of linguistic field data. Recordings become accessible. Transcriptions become verifiable. Large corpora become manageable. In order to illustrate the potential for this mode of operation in field linguistics, I describe a piece of online fieldwork involving a tone language of Cameroon. A complex verb paradigm for Bamileke Dschang has been collected and transcribed, and audio and laryngograph recordings have been digitised and segmented. A central insight of Hyman's analysis concerning the domain of tone rules has been applied to the new data. And a Web program for multidimensional exploration of the data has been developed. These three lines of inquiry - primary description, theoretical analysis, and tool development - are synthesised. What emerges is a new methodology for the investigation of linguistic field data.

Note: This page takes some time to download because of its many fonts and tables. From here on you may want to have a hardcopy of the paper to hand. Please scroll up to get the download pointers.

2 An empirical challenge: tone in Bamileke Dschang

Note that this example, like many others, contains hyperlinks to speech files and embedded database queries (on the right).

Example 1: An Illustration of Lexical Tone
a. H feather
b. HL reading
c. LH navel
d. L finishing
HyperLex:
words with root=tON
more minimal sets

Note that forms (1c) and (1d) are homophonous in the speech of many informants, including this one. Evidently the distinction between final low and final low-falling is being lost.

Example 2: An Illustration of Grammatical Tone
a. the chief buried dogs (immediate past)
b. the chief buries dogs (simple present)
c. the chief will bury dogs (immediate future)
Paradigm: look up these forms | expand to all tenses and cover both verb tones

Note that the lexical content is constant, but that the tone differences communicate verb tense. Click on the pitch transcriptions to hear the laryngograph recording.

Example 3: An Illustration of Tonal Alternations
a. L!H chief of dogs
b. !HH the chief will bury dogs (immediate future)
c. H!H the chief will cover dogs (immediate future)
d. H!L will the chief cover dogs? (immediate future)
Paradigm: lots of sentences with membhU, classified in terms of the final 3 pitches

Observe the four different tonal forms of `membhU' (dogs).
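
The tone labels in Example 3 combine the symbols H and L with the downstep mark `!`. As a minimal sketch of how such labels could be split into individual tones for searching or classification, the Python snippet below tokenises the four labels from Example 3. The function names `tokenize` and `has_downstep` are hypothetical and are not part of the HyperLex or Paradigm systems; the snippet also assumes that `!` always attaches to the tone that follows it.

```python
import re

# One tone token: an optional downstep mark "!" followed by H or L.
# Assumption: "!" always belongs to the tone immediately after it.
TONE = re.compile(r"!?[HL]")

def tokenize(transcription: str) -> list[str]:
    """Split a tone label such as 'L!H' into tokens ['L', '!H']."""
    tokens = TONE.findall(transcription)
    # Reject labels containing symbols this sketch does not handle.
    if "".join(tokens) != transcription:
        raise ValueError(f"unexpected symbol in {transcription!r}")
    return tokens

def has_downstep(transcription: str) -> bool:
    """True if any tone in the label carries the downstep mark."""
    return any(token.startswith("!") for token in tokenize(transcription))

# The four tone labels given in Example 3 (a-d).
forms = ["L!H", "!HH", "H!H", "H!L"]
for form in forms:
    print(form, tokenize(form))
```

Labels that use other symbols, such as the `+` in Examples 4 and 5, are deliberately rejected here rather than guessed at.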

3 Constructing tone paradigms and putting them online

Example 4: Subject Nouns for the Verb Paradigm
a. H+L   lazy man    e. H+H   lazy men
b. HL+L  poor man    f. HL+H  poor men
c. LH+L  cowife      g. LH+H  cowives
d. L+L   chief       h. L+H   chiefs
HyperLex:
H+L H+H
HL+L HL+H
LH+L LH+H
L+L L+H

Note that sound files are only available for singular forms.

Example 5: Object Nouns for the Verb Paradigm
a. L+H   thieves     e. H   bird
b. L+HL  dogs        f. HL  child
c. L+LH  roosters    g. LH  squirrel
d. L+L   leopards    h. L   animal
HyperLex:
H
HL
LH
L

Figure 1: A Tense-Based Slice Through the Verb Paradigm, for Indicative Mood
Columns: High tone verb: kapte (cover) | Low tone verb: kemte (bury)
Rows (tenses): P5, P3, P2, P1, PR, PP, F1, F4, F5
Paradigm: indicative | negative | interrogative | conditional | focus
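
A slice such as Figure 1 holds some dimensions of the paradigm constant and lays out the remaining ones as rows and columns. The sketch below illustrates that idea in Python under stated assumptions: the dimension values are taken from Figure 1 and its Paradigm links, but the dimension names (`mood`, `tense`, `verb`) and the function `slice_paradigm` are hypothetical, and the cells carry no transcriptions. It is not the implementation behind the Paradigm links.

```python
from itertools import product

# Dimension values as listed in Figure 1 and its Paradigm links.
# The dimension names themselves are hypothetical labels for this sketch.
DIMENSIONS = {
    "mood": ["indicative", "negative", "interrogative", "conditional", "focus"],
    "tense": ["P5", "P3", "P2", "P1", "PR", "PP", "F1", "F4", "F5"],
    "verb": ["kapte", "kemte"],  # high tone verb, low tone verb
}

def cells():
    """Yield every cell of the paradigm as a dict of dimension values."""
    names = list(DIMENSIONS)
    for values in product(*DIMENSIONS.values()):
        yield dict(zip(names, values))

def slice_paradigm(rows, cols, **fixed):
    """Return a grid keyed by (row value, column value).

    `fixed` names the dimensions held constant. This sketch assumes that
    rows, cols and fixed together cover every dimension; otherwise
    several cells would share one grid position.
    """
    grid = {}
    for cell in cells():
        if all(cell[name] == value for name, value in fixed.items()):
            grid[(cell[rows], cell[cols])] = cell
    return grid

# A tense-based slice for indicative mood, as in Figure 1.
figure1 = slice_paradigm("tense", "verb", mood="indicative")
print(len(figure1))  # 18 cells: 9 tenses x 2 verbs
```

Changing the fixed value, for example `mood="negative"`, gives the corresponding slice for another of the Paradigm links.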

Figure 2: A Noun-Based Slice Through the Verb Paradigm, for F1 Interrogative
Columns: Varying object nouns | Varying subject nouns
Rows (in each column): L, LH, HL, H
Paradigm: varying object nouns | varying subject nouns

4 Downstep in Bamileke Dschang

4.1 Downstep conditioned by low tone

Example 6: Downstep conditioned by low tone
a. chief of thieves
b. tail of thieves

Example 7: Neutralisation of high and downstepped high
a. tail of dogs
b. tail of thieves

Figure 5: Yesterday Past Indicative for High Tone Verbs with Prefixless Object Nouns
Columns: Indicative | Conditional
Rows: L, LH, HL, H
Paradigm: search for these forms | remote future indicative with high tone verbs and prefixed objects

Note that the recordings of the conditional forms include an additional phrase, to complete the conditional construction. For example, in the top-right form, the full sentence means "If the chief covered (yesterday) the animal, I will thank him."

4.2 Downstep conditioned by high tone

Example 8: !H/!L alternation for puN
a. poor men bury thieves
b. the poor man buries thieves

Example 9: L/!L alternation for menzwi
a. chief of leopards
b. stool of leopards

Example 10: !H/!L alternation in possessive forms
a. horn
b. my horn
HyperLex:
look up this word
other nouns which behave similarly

Example 11: A kind of !L which only shows up after L tone
a. stool of leopards
b. tail of leopards

Figure 6: Simple Present Indicative Varying Subject and Verb
Columns: Indicative | Conditional
Row labels: L, LH, HL, H, L, LH, HL, H
Paradigm: search for these forms

4.4 Towards an inventory of domain types

As in many other places, click on the pitch transcriptions to listen to the laryngograph recording.

Figure 7: Pitch Transcriptions for @fO ... m@tsON
Columns: Indicative (H verb: kapte, L verb: kemte) | Negative (H verb: kapte, L verb: kemte)
Rows (tenses): P5, P3, P2, P1, PR, PP, F1, F4, F5
Paradigm: search for these forms

Figure 8: Tense and Verb-Tone Classified by Domain Boundary Type

This figure is not reproduced since there is no speech data to display. Follow the links below to see how the paradigm system can approximate the table in the paper.
Paradigm: indicative | negative


Informants

The transcriptions and recordings which appear on this page come from Pierre Ngogeo and Albert Tsomejio.

Pierre Ngogeo was born in 1938 in the Mbeng neighbourhood of Bafou and has lived there for his whole life, like his parents before him. His only extended absence from the village was a period in Mbouda (1954-60), 30km NE of Bafou. Pierre completed secondary education and has a teaching diploma. He teaches in a primary school in Mbeng. He is bilingual in Dschang and French, and literate in both languages. Pierre was recorded on 2 June 1997, in Ntsingbeu, Bafou, at the house of Nancy Haynes and Gretchen Harro (SIL) in the compound of chief Ntsala'. His recordings appear in the verb paradigm and the associative construction.
Albert Tsomejio was born in 1964 in the Aga neighbourhood of Bafou and has lived there for his whole life. His mother was from the same neighbourhood, and his father was from Mengala' (Bafou). His only extended absence from the village was a period in Dschang (1983-88), 5km W of Bafou. Albert completed secondary education and a year of tertiary education. He teaches Dschang literacy classes in Bafou. He is bilingual in Dschang and French, and literate in both languages. Albert was recorded on 9 May 1997, at the SIL recording studio in Yaoundé. His recordings appear in all the lexical items.

Acknowledgements

I am grateful to Will Leben and Mark Liberman for their comments on an earlier version of this paper; I assume full responsibility for any oversights and errors it may contain. Nancy Haynes and Gretchen Harro, SIL linguists working in Bafou since 1983, unwittingly stimulated this work with their 54-page, musically transcribed verb paradigm. They also helped identify good informants, permitted me to use their village home on several occasions, and injected an uplifting mixture of sage advice and good humour. Special thanks go to Pierre Ngogeo, a retired teacher of Bafou, whose knowledge of Dschang grammar and whose ability to produce all manner of verb forms have been a major asset. This research was funded by a grant from the UK Economic and Social Research Council to Edinburgh University; it was carried out under the auspices of SIL Cameroon; and it was covered by research permits from the Ministry of Scientific and Technical Research of the Cameroon government.

Links and Images

Chief Kana, paramount chief of Bafou, seated on his throne.
The dictionary team: Tsomejio, Momo, Kouesso, Tadadjeu, Kenfack, Bird, Métangmo.
With a traditional chief, who is delighted to be holding a newly published copy of the Dschang dictionary.


This page was last updated on 17 March 1999.