Finding Musically Meaningful Words by Sparse CCA
A musically meaningful vocabulary is one of the keystones in building a computer audition system that can model the semantics of audio content. If a word in the vocabulary is not clearly represented by the underlying acoustic representation, the word can be considered noisy and should be removed from the vocabulary. This paper proposes an approach to constructing a vocabulary of predictive semantic concepts based on sparse canonical correlation analysis (sparse CCA). The goal is to find words that are highly correlated with the underlying audio feature representation, with the expectation that these words can be modeled more accurately. Experimental results illustrate that, by identifying these musically meaningful words, we can improve the performance of a previously proposed computer audition system for music annotation and retrieval.
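To make the idea of selecting words by their correlation with audio features concrete, the following is a minimal sketch on synthetic data. It is not necessarily the formulation used in this paper: it assumes a simple alternating scheme with an L1 soft-threshold on the word weights (in the style of penalized matrix decomposition) and treats the within-view covariances as identity. The function names, the threshold `lam`, and the toy data are all hypothetical and chosen only for illustration.

```python
import numpy as np

def soft_threshold(a, lam):
    """Shrink each entry of `a` toward zero by `lam` (L1 proximal step)."""
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)

def sparse_cca_words(X, Y, lam=0.3, n_iter=50):
    """Sketch of sparse CCA with sparsity on the word side only.

    X: (n_songs, n_audio_dims) audio features.
    Y: (n_songs, n_words) word annotations.
    Assumption: within-view covariances are treated as identity.
    Returns the audio weights u and the sparse word weights v.
    """
    # Standardize both views so C holds sample cross-correlations.
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    Ys = (Y - Y.mean(axis=0)) / Y.std(axis=0)
    C = Xs.T @ Ys / len(Xs)

    # Start from the leading right singular vector of C.
    v = np.linalg.svd(C, full_matrices=False)[2][0]
    for _ in range(n_iter):
        u = C @ v
        u /= np.linalg.norm(u)
        v = soft_threshold(C.T @ u, lam)
        norm = np.linalg.norm(v)
        if norm == 0.0:  # threshold too aggressive: no word survives
            break
        v /= norm
    return u, v

# Synthetic example (assumed data, not from the paper): words 0-2 follow a
# direction in audio-feature space, words 3-7 are unrelated noise.
rng = np.random.default_rng(0)
n_songs, n_audio_dims, n_words = 1000, 10, 8
X = rng.standard_normal((n_songs, n_audio_dims))
a = rng.standard_normal(n_audio_dims)
a /= np.linalg.norm(a)
z = X @ a
Y = rng.standard_normal((n_songs, n_words))
Y[:, :3] = z[:, None] + 0.5 * Y[:, :3]

u, v = sparse_cca_words(X, Y)
selected = np.flatnonzero(v)  # indices of words kept in the vocabulary
print(selected)
```

In this sketch, words whose weight in `v` is driven exactly to zero are the ones that would be pruned as noisy; the value of `lam` controls how aggressively the vocabulary is reduced and would need to be tuned on real data.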