Steven Bird, PhD(Edin), AMusA
sb@csse.unimelb.edu.au

Associate Professor & Deputy Head
Dept of Computer Science and Software Engineering
University of Melbourne
Victoria 3010, Australia
http://www.csse.unimelb.edu.au/~sb/

Senior Research Associate
Linguistic Data Consortium
University of Pennsylvania
3600 Market Street, Suite 810
Philadelphia, PA 19104-2653, USA
http://www.ldc.upenn.edu/sb/

Office: L6.07, 111 Barry St, Carlton; Phone: +61 3 8344 1361
I am investigating computational models for linguistic structures and processes, with application to language technologies and to the documentation of endangered languages. My current focus is on efficient querying of databases of hierarchically annotated data. After completing a PhD on computational phonology at the University of Edinburgh in 1990, I worked on a series of European research projects and conducted linguistic fieldwork in Cameroon with SIL. In 1998 I moved to the University of Pennsylvania, where I became Associate Director of the LDC and worked on models and tools for linguistic annotation. In 2002 I returned home to Australia and established the Melbourne University Language Technology Group. In 2007 I was awarded the Kelvin Medal for excellence in teaching. For 2008 I am vice president of the Association for Computational Linguistics.
Key Activities: Coordinating first-year Informatics; developing the Natural Language Toolkit; writing a textbook on NLP; leading the Language Technology Group.
Key Publications: Natural Language Processing in Python; Designing and evaluating an XPath dialect for linguistic queries (ICDE); Seven dimensions of portability for language documentation and description (Language); A formal framework for linguistic annotation (Speech Communication); Computational phonology: A constraint-based approach (Cambridge).

Projects and Activities
  • Melbourne University Language Technology Group
  • NICTA Information Discovery Project
  • Querying Linguistic Databases
  • OLAC: Open Language Archives Community
  • NLTK: Natural Language Toolkit
  • AGTK: Annotation Graph Toolkit
  • E-MELD: Preserving Endangered Languages
  • PARADISEC: Pacific And Regional Archive for Digital Sources in Endangered Cultures
  • LanguageLog: A Weblog about Language
  • ARC Network in Human Communication Science
  • Proposed student projects in language technology and digital archives
  • Bird family history

Research Funding

External sponsorship for work on linguistic annotation, digital archives, and language documentation:
  • OLAC: Accessing the World's Language Resources
  • Querying Linguistic Databases
  • The Rosetta Project: ALL Language Archive
  • E-MELD: Electronic Metastructure for Endangered Languages Data
  • TalkBank: A Multimodal Database of Communicative Interaction (Previous grant)
  • Quadriga System for Research Archive of Asia-Pacific Region Audio Recordings (ARC-LE0346848)
  • Natural Language Engineering: Integrating Parametric and Parallel Processing (2003)

Media Stories
  • ICT Update: Language technology
  • Sydney Morning Herald: Big brains coming back to Melbourne
  • Uni News: Reversing the brain drain
  • BBC: Digital race to save languages
  • Uni News: Global mission races to save dying languages
  • ABC Radio National: Indigenous Language
  • Wired News: Word Up: Keeping Languages Alive
  • Scientific American: Saving Dying Languages
  • ABCNews.com: Tongue-Tied
  • National Public Radio: Endangered Languages
  • Internet Week: Ellis Island Project
  • Avotaynu: Ellis Island Project
  • Boston Globe: Vanishing tongues

Recent Publications
  • Natural Language Processing in Python (in preparation).
  • Graphical query for linguistic treebanks, 10th Conference of the Pacific Association for Computational Linguistics.
  • Managing Fieldwork Data with Toolbox and the Natural Language Toolkit, Language Documentation and Conservation.
  • NLTK: The Natural Language Toolkit, Proceedings of the ACL Interactive Demonstration Session.
  • Building a Search Engine to Drive Problem-Based Learning, 11th Annual Conference on Innovation and Technology in Computer Science Education.
  • Designing and Evaluating an XPath Dialect for Linguistic Queries, 22nd International Conference on Data Engineering.
  • Transforming Access to the Spoken Word, Journal of Digital Libraries 5.

Invited Talks
  • First Brazilian School on Computational Linguistics, September 2007, São Paulo, Brazil
  • LSA Summer Institute: Introduction to Computational Linguistics, July 2007, Stanford, USA
  • MSR India Summer School on Natural Language Processing, May 2007, Bangalore, India
  • Texas Linguistics Society 10: Computational Linguistics for Less-Studied Languages, Nov 2006, UT Austin, USA
  • German Linguistics Society Annual Meeting, Feb 2006, U Bielefeld, Germany
  • Intl Conf on Linguistic Evidence, Feb 2006, U Tübingen, Germany

Teaching
  • 433-460: Human Language Technology
  • 433-253: Algorithms and Data Structures
  • 175-410: Computational Linguistics
  • 433-351: Database Systems (2003)
  • CIS 530: Computational Linguistics (1999-2001, Penn)
  • Chair, CSSE Academic Programs Committee
  • Departmental Teaching Excellence Award, 2005

Appointments
  • Association for Computational Linguistics (vice president)
  • Cambridge Studies in NLP (editor)
  • ACL Digital Anthology (editor, 2001-7)
  • SIL Equip Training, Australia (board)
  • Natural Language Engineering (editorial board)
  • Language Resources and Evaluation (editorial board)
  • Australasian Language Technology Association (president, 2003-4)
  • Association for Computational Linguistics (executive, 2001-3)
  • LSA Committee on Endangered Languages and their Preservation (2002-4)

Last modified: Fri Sep 28 09:26:49 EST 2007