January 2013 Newsletter

Tuesday, January 15, 2013

New Corpora

Chinese-English Biology and Chemistry Abstract Parallel Text

GALE Phase 2 Arabic Web Parallel Text



2013 LDC Podcast Available from LDC Blog

Kicking off the new year is the fourth podcast in our 20th Anniversary series, featuring LDC Senior Researcher Mohamed Maamouri.

Mohamed directs the Arabic Treebank group and spearheads the development of Arabic resources and projects. The latter includes a leading role in LDC’s collaboration with Georgetown University Press to develop updated versions of three dialectal Arabic dictionaries (Iraqi, Moroccan, Syrian). In this podcast, he reflects on his personal and professional experiences and comments on Arabic resource development at LDC.

Other podcasts will be published via the LDC Blog, so stay tuned to that space.

Membership Discounts for MY2013 Still Available

If you are considering joining for Membership Year 2013 (MY2013), there is still time to save on membership fees. Any organization that joins or renews membership for 2013 through Friday, March 1, 2013, is entitled to a 5% discount on membership fees. Organizations that held membership for MY2012 can receive a 10% discount on fees provided they renew prior to March 1, 2013. For further information on pricing, please consult our Announcements page or contact LDC.

Penn Discourse Treebank Version 2.0 Update - RTE data

A Recognizing Textual Entailment (RTE) update is now available for Penn Discourse Treebank Version 2.0 LDC2008T05 (PDTB). These data were used to run the textual entailment experiments described in Sara Tonelli and Elena Cabrio, "Hunting for Entailing Pairs in the Penn Discourse Treebank", in Proceedings of COLING 2012, Mumbai, India. The files contain Text-Hypothesis pairs in the standard RTE XML format (for more details, see RTE Challenge at TAC 2011), which have been manually annotated as entailing or not entailing. All sentence pairs were extracted from the Penn Discourse Treebank and are therefore connected by a discourse relation label.

The data are not included in the general release of Penn Discourse Treebank Version 2.0, but are freely available for download from the catalog page.
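
For readers who have not worked with RTE-style files before, the short Python sketch below shows one way Text-Hypothesis pairs might be read. It assumes the layout commonly used in RTE challenge data (a root element containing pair elements, each with a t and an h child and an entailment attribute). The sample content, the attribute values YES/NO, and the function name read_pairs are illustrative assumptions, not taken from the PDTB RTE files; please check the actual files for the element and attribute names they use.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in a commonly used RTE layout. The sentences and the
# attribute values ("YES"/"NO") are invented for illustration and are NOT
# copied from the PDTB RTE update.
SAMPLE = """<entailment-corpus>
  <pair id="1" entailment="YES">
    <t>The company cut prices because demand fell.</t>
    <h>Demand fell.</h>
  </pair>
  <pair id="2" entailment="NO">
    <t>The company cut prices although demand rose.</t>
    <h>Demand fell.</h>
  </pair>
</entailment-corpus>"""


def read_pairs(xml_text):
    """Return a list of dicts, one per <pair> element (assumed layout)."""
    root = ET.fromstring(xml_text)
    pairs = []
    for pair in root.iter("pair"):
        pairs.append({
            "id": pair.get("id"),
            "entailment": pair.get("entailment"),
            # findtext returns None when the child is missing, so fall back to ""
            "text": (pair.findtext("t") or "").strip(),
            "hypothesis": (pair.findtext("h") or "").strip(),
        })
    return pairs


pairs = read_pairs(SAMPLE)
for p in pairs:
    print(p["id"], p["entailment"], "|", p["text"], "=>", p["hypothesis"])
```

To read a downloaded file instead of the inline sample, ET.parse(path).getroot() can replace ET.fromstring, with the rest of the loop unchanged.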