One source of ground-truth information about musical artist similarity is the opinion of human experts. To capture and quantify this, we collected a number of web pages from the All Music Guide, a comprehensive and professional directory of information on musical artists. In particular, we used the "Similar artists" list provided for each band to obtain the most similar, popular bands for each query.
In order to extend these first-order neighbor links to a richer, more complete matrix of similarity comparisons, we calculated an "Erdös" number for each pair of bands we considered: What is the smallest number of hops by which we could get from one band to another, moving only to bands in the "Similar artists" list at each step?
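The hop count described above is a shortest-path length in the graph of "Similar artists" links, which a breadth-first search can compute. The following is a minimal sketch, not the code used to build the published matrix. The artist names and their lists are made up for illustration, links are assumed to be followed only in the direction they are listed, and unreachable artists are marked -1 to match the convention in the data file.

```python
from collections import deque


def erdos_steps(similar, source):
    """Return the hop count from `source` to every artist in `similar`.

    `similar` maps an artist name to the names on its "Similar artists"
    list. Artists that cannot be reached are reported as -1.
    """
    steps = {artist: -1 for artist in similar}
    steps[source] = 0
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for neighbor in similar.get(current, ()):
            # The first visit in breadth-first order is the shortest route.
            if steps.get(neighbor, -1) == -1:
                steps[neighbor] = steps[current] + 1
                queue.append(neighbor)
    return steps


# Hypothetical toy data: four invented artists, one with no links.
similar = {
    "artist_a": ["artist_b"],
    "artist_b": ["artist_a", "artist_c"],
    "artist_c": ["artist_b"],
    "artist_d": [],
}

steps_from_a = erdos_steps(similar, "artist_a")
print(steps_from_a)
```

Running the search once from each artist fills in one row of the full matrix per run.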
We focused on a set of 400 artists, chosen as the most highly represented in a survey of online music collections in the OpenNap network in summer 2000 (see our OpenNap data page). The artist list is given in aset400.txt. This list of 400 artists defines a 400x400 square matrix of Erdös distances, with 0 down the leading diagonal, and up to 13 steps required to connect certain artists. The full matrix, with rows and columns ordered as in the aset400 list, is in aset-erdos-steps.txt.
Twenty-six of the artists had no linkage to the others through the "Similar artists" lists, so their step counts are uniformly -1 (except for the 0 on the diagonal). The most distant relationships in the set are 13 steps long, between wade_hayes and both miles_davis and war (whereas the distance between miles_davis and war is only 2 steps, via sly_and_the_family_stone).
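The unlinked artists and the longest connections can be read straight off the matrix. The sketch below assumes the matrix text is laid out as one row per line of whitespace-separated integers, which may not match the actual layout of aset-erdos-steps.txt. The helper names and the 4x4 example are hypothetical, and the matrix is not assumed to be symmetric.

```python
def parse_matrix(text):
    """Parse whitespace-separated integers, one matrix row per line (assumed layout)."""
    return [[int(tok) for tok in line.split()]
            for line in text.splitlines() if line.strip()]


def isolated_artists(names, matrix):
    """Artists whose every off-diagonal entry in their row is -1."""
    return [names[i] for i, row in enumerate(matrix)
            if all(v == -1 for j, v in enumerate(row) if j != i)]


def farthest_pairs(names, matrix):
    """Largest step count and the ordered (from, to) pairs that attain it."""
    longest = max(max(row) for row in matrix)
    pairs = [(names[i], names[j])
             for i, row in enumerate(matrix)
             for j, v in enumerate(row)
             if i != j and v == longest]
    return longest, pairs


# Hypothetical 4-artist example, not taken from the real data.
names = ["artist_a", "artist_b", "artist_c", "artist_d"]
text = """
0 1 2 -1
1 0 1 -1
2 1 0 -1
-1 -1 -1 0
"""
matrix = parse_matrix(text)
isolated = isolated_artists(names, matrix)
longest, pairs = farthest_pairs(names, matrix)
print(isolated, longest, pairs)
```

Applied to the real files, `names` would come from aset400.txt and `text` from aset-erdos-steps.txt, provided the layout assumption holds.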
More information about this measure, and what we did with it, is in our paper for ISMIR-02, "The Quest for Ground Truth in Musical Artist Similarity".