American National Corpus Restricted Portion - Questions Frequently Asked by Copyright Holders

Is the ANC a commercial venture?

No. The ANC is a research project to support education, research and technology development related to American English. Universities, publishers and technology companies support the ANC because they have an interest in resources and tools for studying language. The Corpus will help everyone involved in linguistic education and research, speech and language engineering, and related language industries to understand how the English language works. This will result in better dictionaries, grammars and teaching materials, and will support the development of computer systems for information retrieval, machine translation and other types of natural language processing. When complete, the Corpus will be made available to organizations involved in the language industries, which will pay a fee only to cover the costs of distribution.

What protection is there against misuse of the electronic texts in the Corpus?

The End User License states that the Licensee "must not ... copy, publish or otherwise give to any third party access to the whole or any part of the ANC Processed Material". The ANC Consortium will not include any text in its entirety in the Corpus. Nevertheless, the publishers in the ANC Consortium recognize the dangers inherent in sharing digital data and are as concerned as you that the Corpus should not allow the abuse of copyrighted texts. A text sample, by being included in the ANC, loses none of the protection of copyright law.

Exactly what uses will be made of the Corpus?

The End User License strictly controls the use of the Corpus and the text samples it contains. Reproduction of the individual original text samples by any means is explicitly forbidden. None of the text samples in their original form will be incorporated into any product. Quotations from text samples will be strictly limited by the Fair Use provisions of copyright law. It is the Corpus as a whole, as a window on the whole language, together with its grammatical and semantic tagging, that is of interest for language research.

Potential uses of the Corpus will typically include: compilation of a dictionary including information about words and their meanings as observed in the Corpus; the development of machine translation software based upon knowledge of English grammar as derived from the Corpus; the preparation of statistical information, for example word frequency counts, from the Corpus.

What fee is being offered for permission?

The ANC is a not-for-profit research project in which each individual text forms less than 0.5% of the whole corpus. The texts are included only as representations of language in use. For these reasons, the ANC Consortium cannot offer a fee for permission.