Title of corpus (proposed) *
Authors (comma separated) *
Languages (comma separated) *
Text
Data type 1 *
- Select - Yes No
Audio
Data type 2 *
- Select - Yes No
Video
Data type 3 *
- Select - Yes No
Lexicon
Data type 4 *
- Select - Yes No
Tool
Data type 5 *
- Select - Yes No
Other(s): please specify
Estimated Data Size (uncompressed) *
Data Size Units *
Bytes Kilobytes (KB) Megabytes (MB) Gigabytes (GB) Terabytes (TB)
Estimated number of tokens, words, hours of speech/video, etc. *
Data Sources *
Describe the nature of the source data and the method of collection, e.g., broadcast, conversation, news, documentary, prompted/spontaneous speech, telephone, background noise, field/studio recordings, metadata (e.g., demographics).
Description *
Provide a description of the corpus. Describe here, rather than in the “Data Sources” field, any annotation or mark-up added to the data, potential applications for the corpus, and the intended audience. This field may overlap somewhat with the preceding one; use the two fields at your discretion to give the clearest possible description of the corpus content.