Dan Ellis :
Music Similarity: Raw Data and Statistics
As part of our research into music similarity and music recommendation
systems, we have been looking at the problem of ground-truth information
and how to collect subjective opinions about music similarity. We are
committed to sharing this information with other interested researchers,
and these pages provide access to the data we have collected and
processed. Where practical, we provide data in its most neutral
and most useful form, i.e., after regularization and tokenization, but
before distillation to the specific forms we use.
Currently, we have descriptions for the following data sources:
There are also some ancillary resources:
- Descriptions of the various metrics we defined for comparing similarity measures against ground-truth, and against one another.
- aset400 is our list of 400 current US pop music artists about whom all the ground truth has been collected.
- uspop2002 is the definition of a corpus of 8764 tracks from the aset400 artists that we have used in our recent acoustic-based similarity experiments. We will distribute features derived from these 8764 tracks given enough interest.
- Some notes on text normalization of artist, album, and track names.
Here are some of our publications that relate to this data:
- M. Mandel, D. Ellis (2005).
  Song-Level Features and Support Vector Machines for Music Classification.
  Proc. Int. Conf. on Music Info. Retrieval ISMIR-05, London, September 2005. (6pp)
- A. Berenzweig, B. Logan, D. Ellis, B. Whitman (2004).
  A large-scale evaluation of acoustic and subjective music-similarity measures.
  Computer Music Journal, 28(2), pp. 63-76, June 2004. (14pp)
- B. Logan, D. Ellis, A. Berenzweig (2003).
  Towards Evaluation Techniques for Music Similarity.
  White paper (keynote!) at the Workshop on the Evaluation of Music Information Retrieval (MIR) Systems at SIGIR-03, Toronto, August 2003. (5pp)
- A. Berenzweig, B. Logan, D. Ellis, B. Whitman (2003).
  A large-scale investigation of acoustic and subjective music similarity measures.
  Submitted to ISMIR-03, Baltimore, October 2003. (8pp)
- A. Berenzweig, D.P.W. Ellis, S. Lawrence (2003).
  Anchor Space for Classification and Similarity Measurement of Music.
  Proc. ICME-03, Baltimore, July 2003. (4pp)
- D. Ellis, B. Whitman, A. Berenzweig, S. Lawrence (2002).
  The Quest for Ground Truth in Musical Artist Similarity.
  Proc. ISMIR-02, Paris, October 2002. (8pp)
This material is based in part upon work supported by the National
Science Foundation under Grant No. IIS-0238301. Any opinions, findings
and conclusions or recommendations expressed in this material are those
of the author(s) and do not necessarily reflect the views of the
National Science Foundation (NSF).
Last updated: 2005/11/16 19:16:52
Dan Ellis <firstname.lastname@example.org>