Human Subjects Collection

LDC builds databases of speech from human subjects across a wide range of languages. Our collection efforts address several different communication modes including interviews, telephone conversations between familiars and strangers, written forms (SMS/chat and handwriting), multichannel dialogs, transcript and wordlist reading, and comms-audio/two-way radio speech.

  • Soundbooth: Designed to isolate the speaker from extraneous noise sources, this room has an acoustical seal and drop sweep on the door, a multi-pane window and two layers of acoustic treatment on the walls and ceiling. To minimize noise and equipment within the soundbooth, all audio and control signals are routed via customized wallplates between the soundbooth interior and a rackmounted digital audio workstation located outside of the soundbooth. The digital audio workstation, multichannel digital-audio interface and TCP/IP controllable microphone preamplifier are all installed in an acoustically isolated half-height rack. The current component list is as follows:
    • Apple Mac Pro, Logic Express
    • Lynx Studio Technology LST16e
    • Apogee 88192 8-channel A/D Interface
    • Millennia HV-3R 8-channel Microphone Preamplifier

LDC maintains a library of microphones for use in the Soundbooth and during collection projects. These include:

    • Earthworks QTC40 Matched Pair: Condenser, small diaphragm, omnidirectional, wide dynamic range, flat, calibrated frequency response, extremely low noise
    • Earthworks QTC30: Condenser, small diaphragm, omnidirectional, wide dynamic range, flat frequency response, extremely low noise
    • DPA 4091: Condenser, small form factor, small diaphragm, omnidirectional, high SPL handling
    • AKG C214: Condenser, large diaphragm, cardioid, studio use
    • Electro-Voice RE20: Dynamic, large diaphragm, cardioid, broadcaster use
    • Audio-Technica AT4022: Condenser, small diaphragm, omnidirectional, general purpose
    • Shure Beta 53 Headset
    • DPA 4066 Headset
  • LDC Telephone Speech Collection Systems: LDC maintains multiple computer telephony systems for collecting speech from the telephone network. Each system supports connection to one T-1 line, providing 24 audio channels, which is enough to record twelve two-person conversations simultaneously. Each collection system's telephony hardware performs interactive voice response (IVR) and call logging functions. One system also includes a T-1 passive-tap call logging board, a multi-session SIP logging telephony board and the Numonix Call Logging Application. LDC has deployed one VoIP telephony server using the Asterisk softswitch application framework, one Sangoma ISDN/PRI T1 Quad interface telephony board and one Sangoma D150 Voice Transcoding Board. Customized IVR software developed at LDC and installed on each system handles all interactions with talkers and their connections, and controls recordings. Supporting software handles automatic transfers of recordings to the main LDC network. Finally, LDC has developed an Asterisk-based VoIP remote call collection system for deployment outside of North America.
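
The channel arithmetic for the telephone systems can be sketched as follows. This is only an illustration of the capacity figure quoted above, not LDC's software: the function name and its parameters are hypothetical, and it assumes each party to a call occupies exactly one of the 24 audio channels on a T-1 line.

```python
T1_CHANNELS = 24  # audio channels per T-1 line, as stated in the text


def simultaneous_conversations(t1_lines: int = 1, parties_per_call: int = 2) -> int:
    """Return how many calls can be recorded at once.

    Hypothetical helper: assumes one audio channel per party.
    """
    if t1_lines < 0:
        raise ValueError("t1_lines must not be negative")
    if parties_per_call < 1:
        raise ValueError("parties_per_call must be at least 1")
    # Integer division: leftover channels cannot host a complete call.
    return (t1_lines * T1_CHANNELS) // parties_per_call


if __name__ == "__main__":
    # One T-1 line, two talkers per call.
    print(simultaneous_conversations())
```

With the defaults (one T-1 line, two talkers per call) the function returns 12, matching the figure given for a single collection system.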