HAVIC

Heterogeneous Audio Visual Internet Collection (HAVIC)

LDC built a large corpus of multi-modal data to support research in a variety of areas, including spoken term detection and video event detection. The HAVIC corpus consists of thousands of hours of “real world” video data collected from the internet. The collection specifically targeted user-generated video content rather than professionally produced or commercial video content.