Please fill out the following form as completely as possible. If more than one choice applies, list other choices in the "other/notes" box. Mandatory fields are marked with an asterisk (*).

Primary contact:
Name*:
E-mail*:
Phone:

Title of corpus (proposed)*:
Version:

Authors (comma-separated):


Languages (comma-separated):


Data type* (select all that apply):
Text, Audio, Video, Lexicon, Tool
Other(s)--please specify:

Estimated delivery date* (When would you be able to provide complete data and documentation to LDC?):


Corpus size (Fill in all that apply. Provide best estimate if exact numbers are not available):
Data size (uncompressed)*:
Hours of Audio or Video:
Number of Words:
Number of Tokens:
Number of Decisions (e.g. for entity annotations):

Text data details (for any text, including transcriptions for audio and video corpora, text elements of lexicons, etc.):
Character encoding: other/notes:
Format/Structure: other/notes:
Markup schema (e.g. NITF, NewsML, MDE, TIMEX2); include specification URL if applicable:

Audio data details (including audio tracks for video corpora):
Sample rate: other/notes:
Audio file extension: other/notes:
Bit depth: other/notes:
Sample format: other/notes:
Channel count: other/notes:

Video data details:
Container: other/notes:
Codec: other/notes:
Broadcast standard: other/notes:
Frame rate: other/notes:
Frame size: other/notes:

Lexicons:
Format: other/notes:

Tools, Other:
Type: other/notes:

Ownership:
Do you own all of the data in this corpus?:
If "No" or "Not sure", please further explain your answer:


Distribution constraints:
Do you expect any external constraints on the distribution of this corpus (e.g. release date, price, licensing)?:
If "Yes" or "Not sure", please explain your answer:


Data sources:
Please describe the nature of the source data and the methods of collection; e.g. broadcast, conversation, news, documentary, prompted/spontaneous, demographics, intended audience, telephone, background noise, field/studio recordings, etc.


Description*:
Please provide a description of the corpus. Descriptions of annotation (including the annotation specification URL), transcription, post-processing, feature extraction, etc. belong here rather than in the "Data sources" field. Some overlap between this field and the preceding field is acceptable; use the two fields at your discretion to give the clearest possible description of the corpus contents.