
The covers80 cover song data set

As described on the main coversongs page, we have been researching automatic detection of "cover songs", i.e. alternative performances of the same basic musical piece by different artists, typically with large stylistic and/or harmonic changes. Although this task has been evaluated as part of MIREX since 2006, that is a closed evaluation that is run only once a year. To help in developing systems for this task, we needed a standard data set with which to measure our progress. That is the purpose of the covers80 dataset, a collection of 80 songs, each performed by 2 artists.

list1.txt lists the "A" versions of the 80 songs, and list2.txt lists the "B" versions of the same 80 songs, in the same order. Whether a given version appears in list1 or list2 is arbitrary. Each entry is in the form of a directory and file name in the canonical file layout, e.g.


.. where we have used our standard canonicalization rules. The covers were assembled somewhat haphazardly. First we went through the 8764 pop music tracks in uspop2002, listening to any tracks with the same name to see if they were covers. That didn't yield enough, so then we found as many pairs as we could for two albums of cover songs we happened to have, one by Annie Lennox ("Medusa") and one by Tori Amos ("Strange Little Girls"). The rest were collected even more randomly. You can download 32 kbps, 16 kHz mono versions of the entire set here: covers80.tgz (164MB). Here is the README file.
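
Because list1.txt and list2.txt are listed in the same order, line i of one file and line i of the other name two versions of the same song. The sketch below shows one way this pairing might be read in Python. The function names are hypothetical, and `top1_accuracy` is only an illustrative way of scoring a system (it assumes you have already computed a similarity matrix between A and B versions, with higher values meaning more alike); it is not taken from this page or from the LabROSA code.

```python
from pathlib import Path


def read_list(path):
    """Return the non-empty lines of a list file, stripped of whitespace."""
    text = Path(path).read_text()
    return [line.strip() for line in text.splitlines() if line.strip()]


def load_cover_pairs(list_a="list1.txt", list_b="list2.txt"):
    """Pair the i-th entry of list_a with the i-th entry of list_b.

    Relies only on the two lists being in the same order, as stated above.
    """
    a_entries = read_list(list_a)
    b_entries = read_list(list_b)
    if len(a_entries) != len(b_entries):
        raise ValueError(
            f"list lengths differ: {len(a_entries)} vs {len(b_entries)}"
        )
    return list(zip(a_entries, b_entries))


def top1_accuracy(scores):
    """Fraction of A versions whose best-scoring B version is the true cover.

    Assumption (not from the source): scores[i][j] is a similarity between
    A version i and B version j, and the true cover of A version i is
    B version i.
    """
    hits = 0
    for i, row in enumerate(scores):
        best_j = max(range(len(row)), key=row.__getitem__)
        if best_j == i:
            hits += 1
    return hits / len(scores)
```

With the real lists, `load_cover_pairs()` would be expected to return 80 pairs, and a 80-by-80 similarity matrix from any cover song system could then be passed to `top1_accuracy`.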


If you report a result using this dataset, you can reference this page as follows:

D. P. W. Ellis (2007). The "covers80" cover song data set
Web resource, available:

We also describe the corpus and compare the performance of several versions of our coversong system on this data set in our ISMIR07 paper:

D. Ellis and C. Cotton (2007)
The 2007 LabROSA Cover Song Detection System
MIREX 2007 Audio Cover Song Evaluation system description, Sep 2007. (4pp)
(See also the poster I presented at ISMIR-07.)


This material is based in part upon work supported by the National Science Foundation under Grant No. IIS-0238301. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation (NSF).

This work was also supported by the Columbia Academic Quality Fund.

Last updated: $Date: 2007/02/06 03:36:03 $
Dan Ellis