Supervised Chord Recognition for Music Audio in Matlab
As our submission to the MIREX 2008 Audio Chord Detection evaluation, we developed a simple chord recognizer that builds models of chord classes from labeled data (supervised training). This page provides access to the code and data used in our system.
For a simple, yet effective, Gaussian-HMM chord recognition system, check out practical 10 of ELEN E4896 Music Signal Processing. Be sure to invoke the suggested power-law compression in the feature loading routine!
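To make the power-law compression step concrete, here is a minimal sketch in Python (not the practical's Matlab code). It raises each chroma value to a fractional power, which shrinks the gap between strong and weak bins. The function name `power_law_compress` and the default exponent of 0.25 are assumptions for illustration; use the value the practical suggests.

```python
def power_law_compress(frames, gamma=0.25):
    """Apply element-wise power-law compression to chroma frames.

    frames: list of chroma vectors (lists of non-negative floats).
    gamma:  compression exponent; 0.25 is an assumed, illustrative value.
    """
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    compressed = []
    for frame in frames:
        if any(v < 0 for v in frame):
            raise ValueError("chroma values are expected to be non-negative")
        compressed.append([v ** gamma for v in frame])
    return compressed


if __name__ == "__main__":
    # One toy 3-bin "frame" just to show the effect on large vs. small values.
    print(power_law_compress([[0.0, 16.0, 1.0]]))
```

With an exponent below 1, large values are reduced far more than small ones, so a few dominant bins no longer swamp the rest of the feature vector.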
We used Chris Harte's Beatles Chord Transcription data. Our system uses beat-synchronous chroma features. You can download these features for all 180 Beatles tracks: beatles-chromftrs.tgz (7.9 MB).
In 2010, we submitted several systems based on the svm-hmm classifier, both in pretrained and train-test guises. The codebase below encompasses these variants.
Although in 2008 we submitted a chord recognition system that took labeled data as input to train new models, for 2009 we submitted a pretrained system that comes ready to label the chords in input audio. This is based on the same codebase as the 2008 submission, but we trained it in-house on the 180 Beatles tracks labeled by Chris Harte. In addition to the recognition system, we are releasing the data we used for training, since we had to manually realign every track extensively to make Chris's label files line up properly with our audio.
We provide our code package, dpwechordrecog-20080825.zip, as submitted to the MIREX competition. It consists mainly of Matlab code, but also includes Jesper Jensen's ISP Toolbox for its optimized, MEX-based chroma calculation, and Kevin Murphy's HMM Toolbox for the HMM Viterbi decoder.
Both the training labels and the classifier output files are in the format
<start_time_in_sec> <label> <start_time_in_sec> <label> ....
where the label is an integer in the range 0 to 24. Labels 0 to 11 represent the major chords C, C#, ... B; 12 to 23 represent the minor chords in the same order; and 24 is the "no chord" symbol. Of course, the precise identity of these chords depends on the labels in the training data, but the algorithm relies on the transposition relationships within 0..11 and 12..23: it transposes the chroma of the labeled chords back to C, and builds just a single major chord model and a single minor chord model.
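The label encoding and the transposition to C can be sketched as follows. This is an illustrative Python sketch, not the Matlab code in the package. The helper names, the "m" suffix for minor chords, the "N" name for no chord, and the assumption that chroma bin 0 corresponds to C are all ours.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
NO_CHORD = 24


def _check_label(label):
    if not 0 <= label <= NO_CHORD:
        raise ValueError(f"label out of range 0..24: {label}")


def label_to_name(label):
    """Map an integer label 0..24 to a readable name (naming is an assumption)."""
    _check_label(label)
    if label == NO_CHORD:
        return "N"
    return NOTE_NAMES[label % 12] + ("m" if label >= 12 else "")


def parse_labels(text):
    """Parse whitespace-separated '<start_time_in_sec> <label>' pairs."""
    tokens = text.split()
    if len(tokens) % 2:
        raise ValueError("expected an even number of tokens")
    pairs = [(float(t), int(lab)) for t, lab in zip(tokens[0::2], tokens[1::2])]
    for _, lab in pairs:
        _check_label(lab)
    return pairs


def transpose_to_c(chroma, label):
    """Rotate a 12-bin chroma vector so the labeled chord's root lands in bin 0.

    Assumes bin 0 of the input is C. Returns (rotated, quality), where
    quality is "maj", "min", or "N" (no chord, returned unrotated).
    """
    _check_label(label)
    if len(chroma) != 12:
        raise ValueError("chroma must have 12 bins")
    if label == NO_CHORD:
        return list(chroma), "N"
    root = label % 12
    rotated = list(chroma[root:]) + list(chroma[:root])
    return rotated, ("min" if label >= 12 else "maj")


if __name__ == "__main__":
    for start, lab in parse_labels("0.0 24 1.5 2 3.2 21"):
        print(start, label_to_name(lab))
```

After rotation, every major-labeled frame can be pooled into one training set and every minor-labeled frame into another, which is what lets two models cover all 24 chords. Models for the other roots would then be obtained by rotating the trained model back the other way.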
This material is based upon work supported by the National Science Foundation under Grant No. IIS-07-13334. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation (NSF).