1) What can I purchase as a non-member?
Almost all of the corpora distributed by the Linguistic Data Consortium are available to non-members at the Non-member Fee. A few corpora, marked "Members Only" (MO), are not available to non-members due to restrictions from the copyright owners. Refer to the LDC Catalog for prices and availability.
The most cost-effective way to obtain LDC data is to purchase membership on an ongoing basis, from year to year.
No, non-members must sign "research only" agreements. Some corpora have specific user agreements, but most corpora are covered by our generic LDC User Agreement for Non-Members. (A single Generic User Agreement can be used to cover any number of LDC corpora for which it is appropriate.)
Refer to the LDC Catalog to see which agreements are needed for particular corpora.
To license a corpus from the LDC, non-members should first complete a user license. Use of most LDC corpora is governed by our generic LDC User Agreement for Non-Members. Some corpora require corpus-specific user licenses. Each corpus description page has a link to the required license agreement. Please see the "Non-member License: Yes" link for the appropriate user license. User licenses can be faxed to +1 215 573 2175 or scanned and emailed to firstname.lastname@example.org. Next, non-members should provide payment or a purchase order for the Non-member Fee. Payment can be made in one of three ways:
Yes, non-members must add a shipping charge to each order that includes a CD-ROM or DVD-ROM: $30 for the US and Canada, $50 overseas.
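As an illustration only, the shipping rule above can be written as a small Python function. The function name, the parameter names, and the destination labels "US" and "Canada" are assumptions made for this sketch; they are not part of any LDC system.

```python
def shipping_charge(contains_disc: bool, destination: str) -> int:
    """Return the non-member shipping charge in US dollars for one order.

    contains_disc: True if the order includes at least one CD-ROM or DVD-ROM.
    destination:   hypothetical label for where the order ships.
    """
    if not contains_disc:
        # Orders without physical discs carry no shipping charge.
        return 0
    # $30 for the US and Canada, $50 for everywhere else.
    return 30 if destination in {"US", "Canada"} else 50
```

For example, `shipping_charge(True, "Canada")` gives 30, and an order with no discs gives 0 regardless of destination.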
Although there is no one standard format for citing electronic resources, LDC suggests the following format based on MLA Style guidelines. Examples:
Liberman, Mark, et al. Emotional Prosody Speech and Transcripts LDC2002S28. CD-ROM. Philadelphia: Linguistic Data Consortium, 2002.
Huang, Shudong, David Graff and George Doddington. Multiple-Translation Chinese Corpus LDC2002T01. FTP FILE. Philadelphia: Linguistic Data Consortium, 2002.
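The pattern shared by these examples (authors, title followed by the catalog ID, medium, then place, publisher and year) can be sketched as a short Python helper. The function and parameter names are hypothetical, and the fixed "Philadelphia: Linguistic Data Consortium" string is simply taken from the two examples.

```python
def format_ldc_citation(authors: str, title: str, catalog_id: str,
                        medium: str, year: int) -> str:
    """Build a citation string in the style of the examples above."""
    # Drop a trailing period (as in "et al.") so it is not doubled.
    author_part = authors.rstrip(".")
    return (f"{author_part}. {title} {catalog_id}. {medium}. "
            f"Philadelphia: Linguistic Data Consortium, {year}.")


print(format_ldc_citation(
    "Huang, Shudong, David Graff and George Doddington",
    "Multiple-Translation Chinese Corpus",
    "LDC2002T01",
    "FTP FILE",
    2002,
))
```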
For many years, the LDC has provided electronic delivery of relatively small corpora (typically, data sets that could be compressed down to 40 megabytes or less) via an FTP mechanism; in recent years, very small corpora (less than 4 MB) have been E-mailed to recipients as MIME attachments.
Although FTP/E-mail delivery worked most of the time for most people, we know there have been numerous problems, affecting both new and experienced users (file-not-found on the FTP server, e-mail account overwhelmed by too much data, security measures at the user's site that blocked FTP access and/or email attachments, and so on).
To put an end to these problems, and provide more effective access to these corpora, we have switched to a new web-based distribution method, using the LDC's intranet.
The LDC Intranet provides login access to a variety of LDC resources. Anyone can establish a "guest" login account free of charge, and the number of resources available to guest accounts will be increasing. Individuals who have obtained corpora from the LDC, or who are affiliated with organizations that have received LDC corpora, already have a wider range of intranet resources available. Among these is the new "Corpus Download" service, for corpora that are small enough to deliver electronically.
Connecting to the intranet is simple: your "user name" is your email address. If you have forgotten (or never knew) your password, use the link on the login page to reset it; if your email address is on record, a new password will be sent to you. (If your email address is not on record but you have received data from LDC in the past, please send email to email@example.com for assistance. If you are a first-time user, simply create a new user account, and refer to the paragraph about "guest" accounts below.)
Once you are logged in, you'll see a page containing links to the intranet resources available to you. If your user account is affiliated with an organization that has received any of the smaller corpora from the LDC (or if you personally have received such corpora from the LDC), one of the links you'll find is "Corpora Available for Download".
When you follow this link, you'll see a complete list of all corpora that have ever been requested by / delivered to you or your organization and are small enough for electronic transfer. The listing shows the LDC Catalog ID and Name for the corpus, the date of the LDC invoice that licensed that corpus to your organization, the number of times and most recent date (if any) that the corpus was downloaded over the web, and a button to start downloading the corpus. (The Catalog ID string is a link to the LDC's web catalog page describing the corpus.)
You and other recognized members of your organization may download any corpus listed on this page, whenever and as often as you see fit.
SPECIAL NOTE ABOUT GUEST ACCOUNTS:
Whenever someone uses the LDC intranet login page to create a new user account, they are asked to identify an organization that they belong to. The organization could be "Guest", meaning the new user will not be affiliated with any organization that has ever done business with the LDC.
But if the new user cites an organization that is already in our records, a notification will be sent to one or more known points of contact (POC) at that organization, informing them that a new LDC intranet account has been opened by someone claiming to be affiliated with them. The POC must then use his or her own LDC intranet login account to review the information about the new user, and either authorize or reject the affiliation. Once the POC authorizes the new user as a recognized member of the organization, that user will have access to the Corpus Download page (as well as all other intranet resources available to the organization).
Any individual who obtains corpora from the LDC (whether privately or on behalf of an organization) will automatically be set up with an intranet user account that supports access to the Corpus Download page -- this is a side-effect of the LDC's invoicing process, which includes the recipient's acceptance of the necessary license agreement(s) for the corpora.
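The affiliation review described above can be pictured as a small state model. The Python sketch below is illustrative only: the class, function, and status names are invented for this example, it does not describe how the LDC intranet is actually implemented, and it covers only accounts created through the login page (not those set up automatically by invoicing). The source does not say what happens when a new user names an organization that is not on record, so the sketch simply raises an error in that case.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    GUEST = "guest"            # not affiliated with any known organization
    PENDING = "pending"        # waiting for the organization's POC to review
    AFFILIATED = "affiliated"  # POC authorized the affiliation
    REJECTED = "rejected"      # POC rejected the affiliation


@dataclass
class Account:
    email: str
    organization: str
    status: Status


def open_account(email: str, organization: str, known_orgs: set[str]) -> Account:
    """Create an account; a known organization triggers a POC review."""
    if organization == "Guest":
        return Account(email, organization, Status.GUEST)
    if organization in known_orgs:
        # In the real process a notification goes to the POC at this point.
        return Account(email, organization, Status.PENDING)
    # Assumption: behavior for unknown organizations is not documented.
    raise ValueError(f"organization not on record: {organization!r}")


def review(account: Account, approve: bool) -> None:
    """Record the POC's decision on a pending affiliation."""
    if account.status is not Status.PENDING:
        raise ValueError("only pending affiliations can be reviewed")
    account.status = Status.AFFILIATED if approve else Status.REJECTED


def can_use_corpus_download(account: Account) -> bool:
    """Only accounts with an authorized affiliation see the download page."""
    return account.status is Status.AFFILIATED
```

In this model a new user who names a known organization starts in the pending state and gains access to the Corpus Download page only after the POC approves the affiliation.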
For inquiries e-mail firstname.lastname@example.org.
Please send payment and a signed user agreement to the following address:
Specific questions should be addressed to email@example.com.