


North American News Text Supplement

Item Name: North American News Text Supplement
Authors: Robert MacIntyre
LDC Catalog No.: LDC98T30
ISBN: 1-58563-137-X
Data Type: text
Data Source(s): newswire
Project(s): EARS, GALE, Hub4, TIDES
Application(s): information retrieval, language modeling
Language(s): English
Language ID(s): eng
Distribution: 1 DVD
Member fee: $0 for 1998 members
Non-member Fee: N/A (Members Only)
Reduced-License Fee: N/A
Extra-Copy Fee: US $200.00
Member License: yes
Online documentation: yes
Licensing Instructions: Subscription Members, Standard Members, Non-Members
Citation: Robert MacIntyre. 1998. North American News Text Supplement. Linguistic Data Consortium, Philadelphia.

Introduction

This release of North American News Text provides a supplement to the LDC's earlier publication of similar materials (LDC95T21: North American News Text Corpus). The same TIPSTER-style SGML markup is used in formatting the data. The data sources are as follows:


    Source                                   Dates Covered   Approx. # Words (Millions)
    ---------------------------------------------------------------------------------
    Los Angeles Times & Washington Post      09/97-04/98      11
    New York Times News Syndicate            01/97-04/98     116
    Associated Press World Stream English    11/94-04/98     143
    ---------------------------------------------------------------------------------
    

The previous North American News release included prior materials from both the LA Times/Washington Post and the New York Times; this supplement provides the continuation of those sources.
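
As a rough illustration of working with the TIPSTER-style SGML markup mentioned above, the following Python sketch pulls document identifiers and plain text out of a string and counts words. The tag names (`<DOC>`, `<DOCNO>`, `<TEXT>`) follow the common TIPSTER convention and are an assumption here, as are the sample stories and the helper names `iter_docs` and `count_words`; the corpus's online documentation should be checked for the actual markup.

```python
import re

# Assumption: each story is wrapped in <DOC>...</DOC>, with an identifier in
# <DOCNO> and the body in <TEXT>. These tag names are the usual TIPSTER
# convention, not something confirmed for this particular release.
DOC_RE = re.compile(r"<DOC\b[^>]*>(.*?)</DOC>", re.DOTALL | re.IGNORECASE)
DOCNO_RE = re.compile(r"<DOCNO>\s*(.*?)\s*</DOCNO>", re.DOTALL | re.IGNORECASE)
TEXT_RE = re.compile(r"<TEXT>(.*?)</TEXT>", re.DOTALL | re.IGNORECASE)
TAG_RE = re.compile(r"<[^>]+>")


def iter_docs(sgml):
    """Yield (docno, plain_text) pairs for each <DOC> element in the string."""
    for match in DOC_RE.finditer(sgml):
        body = match.group(1)
        docno = DOCNO_RE.search(body)
        text = TEXT_RE.search(body)
        raw = text.group(1) if text else ""
        # Drop inline tags such as <p> and collapse runs of whitespace.
        plain = " ".join(TAG_RE.sub(" ", raw).split())
        yield (docno.group(1) if docno else None, plain)


def count_words(sgml):
    """Count whitespace-delimited tokens across all document bodies."""
    return sum(len(text.split()) for _, text in iter_docs(sgml))


# Hypothetical sample data, invented for illustration only.
SAMPLE = """<DOC>
<DOCNO> EXAMPLE-0001 </DOCNO>
<TEXT>
<p>
This is a made-up story.
</p>
</TEXT>
</DOC>
<DOC>
<DOCNO> EXAMPLE-0002 </DOCNO>
<TEXT>
<p>
Second made-up story here.
</p>
</TEXT>
</DOC>
"""

if __name__ == "__main__":
    for docno, text in iter_docs(SAMPLE):
        print(docno, "|", text)
    print("words:", count_words(SAMPLE))
```

A regular-expression pass is used here because SGML of this kind is generally not well-formed XML, so a strict XML parser may reject it. Whitespace tokenization is only a crude stand-in for however the approximate word counts in the table were computed.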

Data

The LDC has been collecting the Associated Press Worldstream newswire service in six languages since 1994. This is the first release of the English-language portion of this service. The material in this set is typically NOT North American in origin -- the reporters who provide the stories may or may not be American-born, but the locations and topics covered are much more heavily international than those of the North American wire services. Reports from Asia, Africa and Europe are found here that show up only rarely or not at all in North American newspapers, including political, financial and sports stories that are presumably geared to English-speaking readers in those parts of the world.

This release, when combined with the LDC's earlier NA News Text Corpus, constitutes all the English-language newswire text collected by the LDC between January 1994 and April 1998, inclusive.

Updates

There are no updates at this time.

Copyright

Portions © 1994-1998 The Associated Press, © 1997-1998 Los Angeles Times - Washington Post News Service, Inc., © 1997-1998 New York Times, © 1998 Trustees of the University of Pennsylvania

Pricing

The Reduced Licensing Fee for this corpus is US$200.



Contact: ldc@ldc.upenn.edu

(c) 1992-2010 Linguistic Data Consortium, University of Pennsylvania. All Rights Reserved.