Human Subjects Collection

LDC builds databases of speech from human subjects across a wide range of languages. Our collection efforts address several different communication modes including interviews, telephone conversations between familiars and strangers, written forms (SMS/chat and handwriting), multichannel dialogs, transcript and wordlist reading, and comms-audio/two-way radio speech.

  • Soundbooth: Designed to isolate the speaker from extraneous noise sources, this room has an acoustical seal and drop sweep on the door, a multi-pane window and two layers of acoustic treatment on the walls and ceiling. To minimize noise and equipment within the soundbooth, all audio and control signals are routed via customized wallplates between the soundbooth interior and a rackmounted digital audio workstation located outside of the soundbooth. The digital audio workstation, multichannel digital-audio interface and TCP/IP controllable microphone preamplifier are all installed in an acoustically isolated half-height rack. The current component list is as follows:
    • Apple Mac Pro, Logic Express
    • Lynx Studio Technology LST16e
    • Apogee 88192 8-channel A/D Interface
    • Millennia HV-3R 8-channel Microphone Preamplifier

LDC maintains a library of microphones for use in the Soundbooth and during collection projects. These include:

    • Earthworks QTC40 Matched Pair: Condenser, small diaphragm, omnidirectional, wide dynamic range, flat, calibrated frequency response, extremely low noise
    • Earthworks QTC30: Condenser, small diaphragm, omnidirectional, wide dynamic range, flat frequency response, extremely low noise
    • DPA 4091: Condenser, small form factor, small diaphragm, omnidirectional, high SPL handling
    • AKG C214: Condenser, large diaphragm, cardioid, studio use
    • Electro-Voice RE20: Dynamic, large diaphragm, cardioid, broadcaster use
    • Audio-Technica AT4022: Condenser, small diaphragm, omnidirectional, general purpose
    • Shure Beta 53 Headset
    • DPA 4066 Headset
    • RØDE NT6: Condenser, small diaphragm, cardioid, studio use
    • Audio-Technica AT3035: Condenser, large diaphragm, cardioid, studio use
    • Samson C01U: Condenser, large diaphragm, cardioid, USB microphone
    • Audio-Technica ATR35a: Condenser, small diaphragm, cardioid, lavaliere
    • Shure MX185: Condenser, lavaliere
    • Shure MX183: Condenser, lavaliere
    • Crown PZM30D: Condenser, PZM microphone
    • Audio-Technica AT8015: Condenser, shotgun microphone
    • Audio-Technica AT8035: Condenser, shotgun microphone
    • Audio-Technica Pro45: Condenser, hanging microphone
    • Audio-Technica Pro24-CM: Condenser, stereo microphone
    • Acoustic Magic Voicetracker: Array microphone
    • Shure VP64A: Dynamic microphone
    • Optoacoustics Fiber-Optic microphone


  • LDC Telephone Speech Collection Systems: LDC maintains multiple computer telephony systems for collecting speech from study participants via the telephone network, including:
    • A current LDC-based system consisting of a T-1 connection with the capacity to record twelve simultaneous two-channel conversations; this system utilizes the Syntellect CT-ADE (Computer Telephone Advanced Development Environment) and the VOS7 programming language
    • One planned LDC-based system, currently under construction, using the Asterisk computer telephony development environment and a SIP service with 24-channel capacity (for 12 recorded conversations)
    • Three remote SIP-based platforms utilizing the Asterisk computer telephony development environment, each capable of recording 15 simultaneous conversations
    • LDC-based mirror systems with VoIP handsets that correspond to overseas telephone platforms and allow for remote application design and troubleshooting
    Customized Interactive Voice Recording (IVR) software, developed at LDC and installed on each system, controls recordings and handles all interactions and connections between speakers. Supporting software handles automatic transfers of recordings to the main LDC network. The two main software frameworks in use for telephony development are VOS and Asterisk. The primary telephony interfaces in use are Dialogic, AudioCodes and Sangoma. Hardware includes a VoIP telephony server using the Asterisk softswitch application framework, a Sangoma ISDN/PRI T1 Quad interface telephony board and a Sangoma D150 Voice Transcoding Board.
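The capacity figures above follow from simple channel arithmetic. The Python sketch below is only an illustration: it assumes each recorded conversation occupies two channels (one per speaker), as described for the T-1 system, and the function name max_conversations is hypothetical rather than part of any LDC software.

```python
# Assumption: every recorded conversation uses one channel per speaker.
CHANNELS_PER_CONVERSATION = 2


def max_conversations(channels: int) -> int:
    """Return how many two-channel conversations fit in the given channel count."""
    return channels // CHANNELS_PER_CONVERSATION


# A T-1 line carries 24 channels, which matches the twelve simultaneous
# two-channel conversations listed for the current LDC-based system.
t1_conversations = max_conversations(24)

# The planned SIP service is listed with 24-channel capacity.
sip_conversations = max_conversations(24)

# Three remote platforms, each recording 15 simultaneous conversations.
remote_conversations = 3 * 15

print(t1_conversations, sip_conversations, remote_conversations)
```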


  • LDC Multichannel Collection System: LDC maintains a multichannel collection system to support collecting and routing speech for up to four interlocutors. The system can be customized to record speech from some or all speakers. Background noise, used to elicit specific speech properties, can be routed to each interlocutor individually. Speech can also be routed, mixed, and recorded under computer control. This allows for control of the specific inputs each speaker hears, such as other speakers’ voices, their own voice and background noise, and the level at which these inputs are heard. Both the clean speech channels and the mixed (background noise and speech) channels can be recorded simultaneously.
    The system includes routing and signal processing via a Lectrosonics DM1612 Analog Matrix Mixer, which has 16 balanced analog inputs (eight microphone preamplifiers and eight line-level connections) and 12 balanced analog outputs. The Matrix Mixer provides 192 crosspoints (16 inputs × 12 outputs) at which the gain and mixture mode can be set individually. Each input can also be routed independently through the matrix to any output. Multichannel recordings are made using a Digigram VX882e computer audio interface with eight balanced analog inputs and outputs along with eight AES/EBU digital audio inputs and outputs. Each interlocutor wears a BeyerDynamic DT290 closed-back headset equipped with a dynamic hypercardioid boom microphone, a close-talking mic that minimizes background speech and noise. The system is controlled using a Windows 10 digital audio workstation, which runs the Lectrosonics Control Panel, the Audacity digital audio recording application, and the Digigram Low-Latency ASIO device driver.
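To make the crosspoint idea concrete, the following Python sketch models a matrix mixer as a table of gains with one entry per input/output pair. It is a conceptual illustration only and does not use the Lectrosonics control software or any real device API. The channel plan (which inputs carry the headset microphones and the noise source, and which output feeds a given headset) and the gain values are assumptions made for the example.

```python
NUM_INPUTS = 16   # balanced analog inputs on the mixer, per the text
NUM_OUTPUTS = 12  # balanced analog outputs, per the text


def db_to_linear(db: float) -> float:
    """Convert a gain in decibels to a linear amplitude factor."""
    return 10 ** (db / 20)


# One gain per crosspoint; 0.0 means the input is not routed to that output.
gains = [[0.0] * NUM_INPUTS for _ in range(NUM_OUTPUTS)]
crosspoints = sum(len(row) for row in gains)  # 16 * 12 = 192

# Hypothetical channel plan (not from the source): inputs 0-3 carry the four
# headset microphones, input 8 carries the background noise source, and
# output 0 feeds the headphones of speaker 0.
gains[0][1] = db_to_linear(0.0)    # other speakers at unity gain
gains[0][2] = db_to_linear(0.0)
gains[0][3] = db_to_linear(0.0)
gains[0][0] = db_to_linear(-6.0)   # speaker 0's own voice, attenuated
gains[0][8] = db_to_linear(-12.0)  # background noise, attenuated further


def mix(samples: list[float]) -> list[float]:
    """Mix one frame of input samples into one sample per output channel."""
    return [sum(g * s for g, s in zip(row, samples)) for row in gains]
```

Each row of the table describes what one output carries, so changing what a single interlocutor hears means editing only that row, while the clean microphone inputs remain available for recording unchanged.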