BOLT Publications

Collection

Ann Bies, Zhiyi Song, Mohamed Maamouri, Stephen Grimes, Haejoong Lee, Jonathan Wright, Stephanie Strassel, Nizar Habash, Ramy Eskander, Owen Rambow
Transliteration of Arabizi into Arabic Orthography: Developing a Parallel Annotated Arabizi-Arabic Script SMS/Chat Corpus
EMNLP 2014: Conference on Empirical Methods in Natural Language Processing, Doha, October 25-29
ANLP Workshop: Arabic Natural Language Processing Workshop
Available: Paper in PDF

Zhiyi Song, Stephanie Strassel, Haejoong Lee, Kevin Walker, Jonathan Wright, Jennifer Garland, Dana Fore, Brian Gainor, Preston Cabe, Thomas Thomas, Brendan Callahan, Ann Sawyer 
Collecting Natural SMS and Chat Conversations in Multiple Languages: The BOLT Phase 2 Corpus 
LREC 2014: 9th Edition of the Language Resources and Evaluation Conference, Reykjavik, May 26-31
Available: Paper in PDF, Poster in PDF

Jennifer Garland, Stephanie Strassel, Safa Ismael, Zhiyi Song, Haejoong Lee
Linguistic Resources for Genre-Independent Language Technologies: User-Generated Content in BOLT
LREC 2012: 8th International Conference on Language Resources and Evaluation, Istanbul, May 21-27
Available: Paper in PDF, Slides in PDF

Word Alignment

Stephen Grimes, Katherine Peterson, Xuansong Li
Automatic Word Alignment Tools to Scale Production of Manually Aligned Parallel Texts
LREC 2012: 8th International Conference on Language Resources and Evaluation, Istanbul, May 21-27
Available: Paper in PDF

Xuansong Li, Stephanie Strassel, Stephen Grimes, Safa Ismael, Mohamed Maamouri, Ann Bies, Nianwen Xue
Parallel Aligned Treebanks at LDC: New Challenges Interfacing Existing Infrastructures
LREC 2012: 8th International Conference on Language Resources and Evaluation, Istanbul, May 21-27
Available: Paper in PDF

Treebank

Ann Bies, Justin Mott, Seth Kulick, Jennifer Garland, Colin Warner
Incorporating Alternate Translations into English Translation Treebank
LREC 2014: 9th Edition of the Language Resources and Evaluation Conference, Reykjavik, May 26-31
Available: Paper in PDF, Poster in PDF

Ann Bies, Denise DiPersio, Mohamed Maamouri
Linguistic Resources for Arabic Machine Translation: The Linguistic Data Consortium (LDC) Catalog
In Abdelhadi Soudi, et al., Challenges for Arabic Machine Translation
Available: John Benjamins Publishing Company

Seth Kulick, Ann Bies, Justin Mott
Further Developments in Treebank Error Detection Using Derivation Trees
LREC 2012: 8th International Conference on Language Resources and Evaluation, Istanbul, May 21-27
Available: Paper in PDF, Poster in PDF

Seth Kulick, Ann Bies, Justin Mott
Using Supertags and Encoded Annotation Principles for Improved Dependency to Phrase Structure Conversion
NAACL-HLT 2012: The 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Montreal, June 3-8
Available: Paper in PDF, Poster in PDF

Mohamed Maamouri, Ann Bies, Seth Kulick
Expanding Arabic Treebank to Speech: Results from Broadcast News
LREC 2012: 8th International Conference on Language Resources and Evaluation, Istanbul, May 21-27
Available: Paper in PDF, Poster in PDF

PropBank

Claire Bonial, Kevin Stowe, Martha Palmer 
Renewing and Revising SemLink 
GenLex-2013: the GenLex Workshop on Linked Data in Linguistics, Pisa, Italy, September 2013
Available: Paper in PDF

Co-reference

Sameer Pradhan, Alessandro Moschitti, Nianwen Xue, Olga Uryupina, Yuchen Zhang
CoNLL-2012 Shared Task: Modeling Multilingual Unrestricted Coreference in OntoNotes
CoNLL-2012: the Sixteenth Conference on Computational Natural Language Learning: Shared Task. July 2012
Available: Paper in PDF 

Annotation Tool

Jonathan Wright
RESTful Annotation and Efficient Collaboration
LREC 2014: 9th Edition of the Language Resources and Evaluation Conference, Reykjavik, May 26-31
Available: Paper in PDF

Jonathan Wright, Kira Griffitt, Joe Ellis, Stephanie Strassel, Brendan Callahan
Annotation Trees: LDC's Customizable, Extensible, Scalable Annotation Infrastructure
LREC 2012: 8th International Conference on Language Resources and Evaluation, Istanbul, May 21-27
Available: Paper in PDF