Data Scholarships

Important Dates

Submission Deadline for the Fall 2024 semester: September 15, 2024

Winner Notification: Rolling

Program Details

The data scholarship program provides eligible students with no-cost access to LDC data. These awards are subsidized by the Consortium to help talented individuals complete language research tasks.

Data scholarships are offered biannually, during the fall and spring semesters. Students must complete an application, and the selection process is highly competitive. Multiple students may receive scholarships during each program cycle.

Applicant Eligibility Requirements

Data scholarships are available to students pursuing undergraduate or graduate studies at an accredited college or university who do not otherwise have access to the necessary data. Scholarships are not restricted to any particular field of study; however, students must demonstrate a well-developed research agenda and a bona fide inability to pay for the requested data.

Application Process

The application consists of two parts:

1. Data Use Proposal: applicants must submit a proposal describing their intended use of the data. The proposal must contain the following:

  • The database(s) requested
  • A brief description of the research project
  • A description of how the data will be used and how success will be measured

Consult the Catalog for a complete list of data distributed by LDC. Some corpora are available to Consortium members only. Applicants are advised to request no more than two databases per cycle. Students may submit an application in a subsequent cycle once they have processed the initial database(s) and have published or presented work in a juried venue.

Proposals should be a maximum of two pages in 12-point font, double-spaced.

2. Letter of Support: applicants must submit one letter of support from their thesis advisor or department chair. The letter must verify the student's need for data and confirm that the department or university lacks funding to pay the applicable nonmember license fee or to join the Consortium. The letter must also be printed on letterhead and include the signature of the thesis advisor or department chair.

The letter of support should:

  • Describe the student: include the student's name, university name, field of study, etc. The advisor should elaborate on the type and duration of the relationship with the student.
  • Describe the research: indicate the LDC data to be used and describe the student's research. The letter must evaluate the probability of success of the research.
  • Describe the need: elaborate on the funding situation of the department and comment on the lack of funds.
  • Be printed on letterhead and include a signature

Evaluation Criteria

Applications are evaluated according to the following criteria. First, LDC will reject applications that:

  • Are incomplete, i.e., missing either a Data Use Proposal or a Letter of Support. In particular, applications with a letter of support that does not express the advisor's confidence in the proposal and the student will be rejected.
  • Contain a Data Use Proposal or Letter of Support that does not meet the requirements stated above
  • Are received after the submission deadline

In addition, LDC will rate a Data Use Proposal more highly if the student demonstrates the following:

  • An understanding of the database(s) requested. For example, if a proposal aims to develop an entity tagging technology, then the requested corpus should be entity-tagged or the proposal must make a provision for adding such tags.
  • An evaluation methodology appropriate to the type of research proposed. For example, for research in speech recognition, a proposal that plans to use a database, evaluation protocol, and scorer already used in recognized evaluation campaigns would be rated higher than one that does not.
  • A research methodology appropriate to the student’s field. For example, proposals that adopt an accepted methodology, or that motivate an alternative methodology, will be rated higher than those that simply adopt a new methodology without justification.
  • Appropriate planning. For example, if a proposal aims to process a very large corpus in a short amount of time, the student should mention how the necessary computing resources will be deployed.

Submission materials will not be returned. All submissions are the property of LDC and its members.

Data scholarship recipients will be notified by email and will be announced publicly. Awardees must sign one or more additional user license agreements. Most corpora will be covered by the LDC User Agreement for Nonmembers. A list of corpus-specific user license agreements is available. Recipients must correctly cite awarded corpora in any scholarly paper.

Email applications to the LDC Data Scholarships program. Decisions will be sent by email from the same address.