


CCGbank

Item Name: CCGbank
Authors: Julia Hockenmaier and Mark Steedman
LDC Catalog No.: LDC2005T13
ISBN: 1-58563-340-2
Release Date: May 15, 2005
Data Type: text
Data Source(s): newswire
Project(s): GALE, TIDES
Application(s): automatic content extraction, cross-lingual information retrieval, information detection, natural language processing
Language(s): English
Language ID(s): eng
Distribution: Web Download
Member fee: $0 for 2005 members
Non-member Fee: US$600.00
Reduced-License Fee: US$300.00
Extra-Copy Fee: N/A
Non-member License: yes
Online documentation: yes
Licensing Instructions: Subscription Members, Standard Members, Non-Members
Citation: Julia Hockenmaier and Mark Steedman. 2005. CCGbank. Linguistic Data Consortium, Philadelphia.

Introduction

CCGbank is a translation of the Penn Treebank into a corpus of Combinatory Categorial Grammar derivations. It pairs syntactic derivations with sets of word-word dependencies which approximate the underlying predicate-argument structure.
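To make the idea of a derivation paired with word-word dependencies concrete, here is a minimal sketch of the two basic CCG application rules (forward: X/Y Y => X; backward: Y X\Y => X) applied to a toy sentence. The class names, the three-word lexicon, and the (head, argument) dependency tuples are assumptions made for illustration only; they are not the CCGbank file format, and the real corpus uses a richer rule set and dependency representation.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple, Union


@dataclass(frozen=True)
class Atom:
    """An atomic category such as S or NP."""
    name: str

    def __str__(self) -> str:
        return self.name


@dataclass(frozen=True)
class Functor:
    """A complex category: result/arg seeks its argument to the right,
    result\\arg seeks it to the left."""
    result: "Category"
    slash: str
    arg: "Category"

    def __str__(self) -> str:
        return f"({self.result}{self.slash}{self.arg})"


Category = Union[Atom, Functor]


@dataclass(frozen=True)
class Item:
    """A derived constituent: its category plus its lexical head word."""
    cat: Category
    head: str


def combine(left: Item, right: Item,
            deps: List[Tuple[str, str]]) -> Optional[Item]:
    """Combine two adjacent items by application, recording a
    (head, argument) dependency. Returns None if no rule applies."""
    # Forward application: X/Y  Y  =>  X
    if (isinstance(left.cat, Functor) and left.cat.slash == "/"
            and left.cat.arg == right.cat):
        deps.append((left.head, right.head))
        return Item(left.cat.result, left.head)
    # Backward application: Y  X\Y  =>  X
    if (isinstance(right.cat, Functor) and right.cat.slash == "\\"
            and right.cat.arg == left.cat):
        deps.append((right.head, left.head))
        return Item(right.cat.result, right.head)
    return None


# Hypothetical toy lexicon (not taken from the corpus).
S, NP = Atom("S"), Atom("NP")
TRANSITIVE = Functor(Functor(S, "\\", NP), "/", NP)  # (S\NP)/NP

deps: List[Tuple[str, str]] = []
verb_phrase = combine(Item(TRANSITIVE, "likes"), Item(NP, "John"), deps)
sentence = combine(Item(NP, "Mary"), verb_phrase, deps)

print(sentence.cat)  # category of the whole derivation
print(deps)          # word-word dependencies collected along the way
```

In this sketch each application step both builds the derivation and records which word filled an argument slot of which head, which is the sense in which the dependencies approximate predicate-argument structure.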

Data

CCGbank contains 99.44% of the sentences in the Penn Treebank. For these sentences, it also corrects a number of inconsistencies and errors in the original annotation.

Samples

For an example of this corpus, please examine this sample.

Update

The current version, 1.1, is a bug-fix release that supersedes the previous package. It is available for download.

Content Copyright

Portions © 2005 Julia Hockenmaier and Mark Steedman, © 2005 The Trustees of the University of Pennsylvania.



Contact: ldc@ldc.upenn.edu

(c) 1992-2008 Linguistic Data Consortium, University of Pennsylvania. All Rights Reserved.