Parallel Text and the Rosetta Stone

The Rosetta Stone

(Image © Hans Hillewaert / CC-BY-SA-3.0)

The Rosetta Stone is probably the most famous example of parallel text. Rediscovered in 1799, the stele bears a decree issued nearly 20 centuries earlier, in 196 BC. Its text, written in Ancient Egyptian hieroglyphs, Demotic script and Ancient Greek, provided critical keys to the modern understanding of Ancient Egyptian written language and culture.

Today, human language researchers use parallel text in machine translation, lexicon induction and the learning of transfer grammars, among other pursuits. LDC has developed and published parallel text in Arabic, Chinese, Czech, French, Hungarian, Korean, Russian, Spanish and Urdu, all paired with English. Genres include blogs, newsgroups, broadcast news and conversation; technical documents in the domains of biology, chemistry, computer security and semiconductors; and newswire, parliamentary and other government documents. Source and translation are aligned, either automatically or with human judgment, at the level of documents, sentences and even words. To learn more, search the LDC Catalog with "parallel" in the keyword field, with "machine translation" in the application field, or by language.