LDC develops tools to support evolving annotation tasks. All tools distributed by LDC are available at no cost under an open source license.
- LDC Broad Phonetic Class Speech Activity Detector: LDC’s speech activity detector is based on the broad phonetic class recognizer implemented in the HTK Speech Recognition Toolkit [1]. It runs the speech signal through a GMM-HMM recognizer to identify five broad phonetic classes: vowel, stop/affricate, fricative, nasal and glide/liquid. The LDC Broad Phonetic Class Speech Activity Detector is available on GitHub [2] under a GPL v3 license [3].
- AGTK, Annotation Graph Toolkit: Annotation Graphs are a formal framework for representing linguistic annotations of time-series data. They abstract away from file formats, coding schemes and user interfaces, providing a logical layer for annotation systems. AGTK is a toolkit for working with the Annotation Graph model. AGTK is made available under the Common Public License [4].
- CTK, Champollion Toolkit: CTK aims to provide ready-to-use parallel text sentence alignment tools for as many language pairs as possible. CTK is made available under the GNU General Public License, version 3.0 [5].
- LDC Word Aligner [6]: A tool used to build manual word alignments. LDC Word Aligner is made available under the GNU General Public License, version 3.0 [5].
- SPHERE Conversion Tools [7]: Programs for converting NIST SPHERE speech files to other formats.
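
To make the Annotation Graph idea from the AGTK entry more concrete, here is a minimal sketch of the model in Python. It is not the AGTK API: the class and method names (Node, Arc, AnnotationGraph, add_node, add_arc, labels) are hypothetical and chosen only for illustration. It assumes the usual description of an annotation graph as a set of nodes, optionally anchored to time offsets, connected by arcs that carry an annotation type and a label.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """A graph node, optionally anchored to a time offset in the signal."""
    node_id: str
    offset: Optional[float] = None  # seconds; None means unanchored


@dataclass
class Arc:
    """A labeled arc between two nodes; arc_type names the annotation layer."""
    start: str
    end: str
    arc_type: str
    label: str


@dataclass
class AnnotationGraph:
    """Hypothetical container, independent of any file format or UI."""
    nodes: dict = field(default_factory=dict)
    arcs: list = field(default_factory=list)

    def add_node(self, node_id, offset=None):
        self.nodes[node_id] = Node(node_id, offset)

    def add_arc(self, start, end, arc_type, label):
        # Both endpoints must already exist in the graph.
        if start not in self.nodes or end not in self.nodes:
            raise KeyError("arc endpoints must be existing nodes")
        self.arcs.append(Arc(start, end, arc_type, label))

    def labels(self, arc_type):
        """Return (start_offset, end_offset, label) for every arc of one type."""
        return [
            (self.nodes[a.start].offset, self.nodes[a.end].offset, a.label)
            for a in self.arcs
            if a.arc_type == arc_type
        ]


# Two annotation layers ("word" and "speaker") share the same nodes.
graph = AnnotationGraph()
graph.add_node("n0", 0.00)
graph.add_node("n1", 0.32)
graph.add_node("n2", 0.71)
graph.add_arc("n0", "n1", "word", "hello")
graph.add_arc("n1", "n2", "word", "world")
graph.add_arc("n0", "n2", "speaker", "A")

word_layer = graph.labels("word")
speaker_layer = graph.labels("speaker")
```

Because every layer is stored as arcs over shared nodes, a word tier and a speaker tier can coexist without either depending on a particular file format or coding scheme.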
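
As a rough illustration of what converting a NIST SPHERE file involves, the sketch below reads the plain-text header that precedes the audio samples. It is not taken from the SPHERE Conversion Tools; the function name parse_sphere_header is hypothetical, and the example header is synthetic. It assumes the common SPHERE layout: a "NIST_1A" line, a line giving the header size in bytes, then "name -type value" fields terminated by "end_head".

```python
def parse_sphere_header(data: bytes) -> dict:
    """Parse the ASCII header at the start of a SPHERE file (sketch only)."""
    # The first two lines give the magic string and the header size.
    first_lines = data[:16].split(b"\n")
    if len(first_lines) < 2 or first_lines[0].strip() != b"NIST_1A":
        raise ValueError("not a SPHERE header")
    header_size = int(first_lines[1].strip())

    fields = {"header_size": header_size}
    for raw in data[:header_size].split(b"\n")[2:]:
        line = raw.decode("ascii", errors="replace").strip()
        if line == "end_head":
            break
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue  # skip padding or anything that is not a field
        name, ftype, value = parts
        if ftype == "-i":      # integer field
            value = int(value)
        elif ftype == "-r":    # real-valued field
            value = float(value)
        # "-sN" string fields are kept as text
        fields[name] = value
    return fields


# Synthetic header, padded with spaces to the declared size.
header_text = (
    "NIST_1A\n"
    "   1024\n"
    "sample_rate -i 16000\n"
    "channel_count -i 1\n"
    "sample_coding -s3 pcm\n"
    "end_head\n"
)
example = header_text.encode("ascii").ljust(1024)
info = parse_sphere_header(example)
```

A converter would typically use fields such as the sample rate and channel count to write the header of the target format, then copy or decode the samples that start at the header_size offset.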