


Arabic English Parallel News Part 1

Item Name: Arabic English Parallel News Part 1
Authors: Several
LDC Catalog No.: LDC2004T18
ISBN: 1-58563-310-0
Release Date: Oct 26, 2004
Data Type: text
Data Source(s): newswire
Project(s): GALE, TIDES
Language(s): English, Modern Standard Arabic
Language ID(s): ARB, ENG
Distribution: Web Download
Member Fee: $0 for 2004 members
Non-member Fee: US $3000.00
Reduced-License Fee: US $1500.00
Extra-Copy Fee: N/A
Non-member License: yes
Online documentation: yes
Licensing Instructions: Subscription Members, Standard Members, Non-Members
Citation: Several. 2004. Arabic English Parallel News Part 1. Linguistic Data Consortium, Philadelphia.

This corpus contains Arabic news stories and their English translations, which LDC collected via Ummah Press Service from January 2001 to September 2004. It totals 8,439 story pairs and 68,685 sentence pairs, with 2 million Arabic words and 2.5 million English words. The corpus is aligned at the sentence level. All data files are SGML documents.

Please examine this Arabic example and this English example to review a sample of this corpus.

Please contact Xiaoyi Ma with any questions regarding this corpus.

Content Copyright

Portions © 2001-2004 Ummah Press Service

Portions © 2003-2004 Trustees of the University of Pennsylvania



Contact: ldc@ldc.upenn.edu

(c) 1992-2010 Linguistic Data Consortium, University of Pennsylvania. All Rights Reserved.