January 2010 Newsletter

Wednesday, January 20, 2010

New Corpora

Czech Broadcast News MDE Transcripts

GALE Phase 1 Chinese Newsgroup Parallel Text - Part 2

NIST Open Machine Translation 2008 Evaluation (MT08) Selected Reference and System Translations


Newly Expanded Press Release Section

Do you recall reading a newsletter article about the Reduced Licensing Fee but are unsure what you did with the email? Are you curious which organization received LDC's 15,000th corpus distribution nearly eight years ago? If so, be sure to visit LDC's newly expanded Press Release section on our What's New! What's Free! page to read about these topics and more. The Press Release section includes articles from previous newsletters as well as major announcements from LDC. Information is organized into the following categories:

    15th Anniversary Monthly Spotlight Archive - as part of our 15th Anniversary celebration in 2007, we highlighted one aspect of the LDC in our monthly newsletters. These features provided our members and data users with a glimpse of the broad range of LDC’s research activities.

    Conference Attendance by LDC - recent publisher displays and conference participation by LDC.

    Etc. - recent collaborations and grant awards plus other announcements.

    Membership Mailbag Archive - to address the questions that our data users have asked, we introduced our Membership Mailbag series of newsletter articles in May 2008. This periodic series addresses frequently asked questions about LDC data, the LDC Intranet, and the benefits of an LDC membership.

    Member Surveys - LDC conducted two end-of-year surveys to obtain feedback on satisfaction levels with LDC Membership and data releases as well as our corpus catalog, and to gather suggestions on future publications.

    Milestones and Celebrations - information on our landmark corpora distributions and events to celebrate our 10th and 15th anniversary years.

    Use of LDC Corpora in University Summer Schools - ways LDC corpora have been used for teaching purposes at university summer school programs.

The Press Release section will be updated as new announcements are made, so we anticipate that it will be a great resource for information about LDC.


Upcoming LDC Institute Seminar

The LDC Institute will hold its next session on Tuesday, January 26, 2010, from 10:00 a.m. to 12:00 p.m. in the LDC Conference Room at LDC's Philadelphia offices, 3600 Market Street, Suite 810.

The topic of this session will be the U.S. Supreme Court Corpus (SCOTUS) presented by Daniel Katz, J.D., M.P.P., Fellow in Empirical Legal Studies, Michigan Law School, PhD Candidate, Political Science and Public Policy, University of Michigan, and Michael Bommarito, PhD Student, Political Science: Methods & Modeling, University of Michigan.

The corpus of Supreme Court written opinions is a rich linguistic resource. Not only does this corpus provide a longitudinal sample of formal American English, but it is also a source of text with identified authors and vote-coded sentiment. Despite this value and years of qualitative and quantitative material of the United States Supreme Court, no compiled corpus of these opinions is currently available to researchers. The purpose of this talk is (1) to describe efforts to compile both the complete corpus of Supreme Court Opinions and associated metadata, (2) to outline a number of our current research projects utilizing this data, and (3) to discuss any criticism, potential projects, or possible collaboration.

Refreshments will be provided. If you are in the area, we hope to see you there!