LSA 2012 Workshop

Coding for Sociolinguistic Archive Preparation, LSA 2012 Workshop


We are grateful for the funding supplied by NSF BCS Grant #1144480, which made this workshop possible. The workshop took place on January 4-5, 2012, in Portland, Oregon, preceding the LSA 2012 Annual Meeting.


Christopher Cieri, LDC
Malcah Yaeger-Dror, University of Arizona and LDC


One of the features of a unified archive would be a coherent, conjoined set of coding conventions to be shared by researchers contributing data. Sharable, comparable demographic protocols together with shared corpora enable follow-up and comparative studies. More sophisticated demographic coding will not only permit accurate cell distribution, but will also ensure that corpora can be archived and shared by the larger community.

The first segment discusses the importance of metadata for sharable archives; its presenters are all in charge of large archival corpora. The presenters in the second segment are each carrying out research in communities for which the ‘ethnic’ designations common to our field are obviously inappropriate, or at best under-differentiated. The presenters in the third segment are known for their interest in and study of other demographic factors which are not always coded. Discussants raise questions to be taken up by presenters and other participants.

The recent discussion on Var-L, a sociolinguists’ online discussion forum, has pointed out the degree to which our field is primed for the development of larger archival corpora. This workshop brings together those interested in pursuing large-scale collaborative sociolinguistic research, coordinating cross-regional and cross-group coding. The groundwork laid by this workshop provides a platform for continued discussion at a Satellite Workshop at the Winter LSA.


  1. to catalog, based on field experience, the need for more detailed demographic categories that other researchers can and should exploit, both for their own immediate analyses and to facilitate sharing among research groups
  2. to encourage the use of a core set of demographic metadata coding options


  1. Introduction
    1. Cieri: paper, slides
    2. Yaeger-Dror: paper, slides
  2. Large Corpora/Metadata
    1. MacWhinney: paper, slides
    2. Simons: slides
    3. Discussant: Shen: slides
  3. Human Subjects' Protection
    1. Warner: paper, slides
    2. DiPersio: paper, slides
    3. Discussant: MacWhinney
  4. Demographic Coding
    1. Blake: paper, slides
    2. Wong & Hall-Lew: paper, slides
    3. Fought: paper, slides
    4. Eckert: paper
    5. Bowie: slides
    6. Discussant: Poplack
  5. Coding for Social Attitudes
    1. Llamas: paper, slides
    2. Nagy: paper, slides
    3. Discussant: Noels: paper, slides
    4. Discussant: Poplack
  6. Day 1 Recap
    1. Cieri: slides
  7. Coding Social Situation
    1. Tagliamonte: paper, slides
    2. Rickford
    3. Discussant: Llamas: slides
  8. Next Steps
    1. Simons: slides

Workshop Links

LSA (Linguistic Society of America)
Let's Go Dialog System