
Artist ID using Matlab, netlab, and uspop2002


We are distributing a large set of precomputed features for 8764 songs from the uspop2002 data set. This page gives a brief example of how to use this data to train a simple classifier that discriminates between two artists using a likelihood ratio test between Gaussian Mixture Models (GMMs), one trained on each artist. We'll do this within Matlab, and make use of the very nice netlab machine learning toolbox.
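
For reference, the test statistic computed in the session below can be written out as follows. The notation (X, T, \lambda_A, \lambda_B, K, w_k, \mu_k, \sigma_k) is introduced here only for this sketch and does not come from netlab; the diagonal covariances match the 'diag' setting used in the code.

```latex
% Per-track average log-likelihood ratio over the T feature frames x_1 .. x_T
\mathrm{LLR}(X) = \frac{1}{T}\sum_{t=1}^{T}\log p(x_t \mid \lambda_A)
                - \frac{1}{T}\sum_{t=1}^{T}\log p(x_t \mid \lambda_B)

% Each artist model \lambda is a K-component GMM with diagonal covariances
p(x \mid \lambda) = \sum_{k=1}^{K} w_k \,
    \mathcal{N}\!\left(x;\ \mu_k,\ \operatorname{diag}(\sigma_k^2)\right)
```

A track is labeled as artist A (Abba in the example) when LLR(X) > 0, and as artist B (the Beatles) when LLR(X) < 0.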

We assume that Matlab's current working directory is the "artist/" directory, so we can simply read the HTK files in its subdirectories. We also assume that the standard uspop2002 data files are in "../doc" (as they are on the distribution DVD), and that readhtk.m is available on Matlab's path.

 % Load the master data file describing the data
 uspop = textread('../doc/uspop2002-aset.txt', '%s');
 % Reshape into a 4 x ntracks cell array
 uspop = reshape(uspop, 4, length(uspop)/4);
 % Now each column describes one track; rows are artist, album, track #, name
 % Read in the corresponding feature file names for each track
 usfiles = textread('../doc/uspop2002-aset-files.txt', '%s');
 % usfiles(i) is the filename for the track described by uspop(:,i)
 
 % Get the indices for all tracks by the Beatles, and by Abba
 ixbtls = find(strcmp(uspop(1,:), 'beatles'));
 ixabba = find(strcmp(uspop(1,:), 'abba'));
 % Concatenate features for the first 10 tracks by Abba
 da = [];
 for i = 1:10; da = [da;readhtk(char(usfiles(ixabba(i))))]; end
 % Similarly for the Beatles
 db = [];
 for i = 1:10; db = [db;readhtk(char(usfiles(ixbtls(i))))]; end
 % Train 10-mixture GMMs over the 20-dimensional MFCCs, one for each artist
 ndim = 20;
 nmix = 10;
 % Initialize and train the 2 GMMs
 gma = gmm(ndim,nmix,'diag');   % gma is GMM for Abba
 gmb = gmm(ndim,nmix,'diag');   % gmb is GMM for Beatles
 options = foptions;     % default optimization options
 options(14) = 5;        % 5 iterations of k-means in initialization
 gma = gmminit(gma,da,options);
 gmb = gmminit(gmb,db,options);
 emtions = zeros(1, 18);
 emtions(14) = 20;       % Number of iterations of EM to do 
 % subsample the training frames by a factor of 10 to speed things up
 gma = gmmem(gma, da(1:10:end,:), emtions);
 gmb = gmmem(gmb, db(1:10:end,:), emtions);
 
 % Now evaluate the log-likelihood ratio (LLR) between the two
 % models for the first 20 tracks from each artist.  A value
 % greater than zero indicates classification as Abba; a value less
 % than zero indicates the Beatles.  The first 10 tracks in each case
 % are the training data (so should do well!), whereas the next 10
 % are held out.
 for i = 1:20
   dd = readhtk(char(usfiles(ixabba(i))));
   disp(num2str(mean(log(gmmprob(gma,dd))) - mean(log(gmmprob(gmb,dd)))));
 end
 1.7597
 1.6639
 2.5584
 1.3849
 3.0646
 2.8987
 1.1042
 3.2112
 1.8611
 2.3773
 1.9904
 1.3427
 1.9295
 1.6319
 0.28293
 0.2766
 2.1931
 0.62696
 0.55051
 -1.4684
 for i = 1:20
   dd = readhtk(char(usfiles(ixbtls(i))));
   disp(num2str(mean(log(gmmprob(gma,dd))) - mean(log(gmmprob(gmb,dd)))));
 end
 -3.2117
 -2.3593
 -3.6062
 -4.2403
 -3.9227
 -3.3714
 -4.1777
 -3.6854
 -5.5681
 -4.4663
 -1.952
 -4.0578
 -2.258
 -3.9241
 -3.7512
 -1.6199
 -2.99
 -2.0967
 -2.8276
 -2.2747
 % Mostly, the Abba files have positive LLRs and the Beatles files have negative LLRs,
 % indicating correct classification.  However, the last Abba track is
 % classified as Beatles.  Let's see which track it is:
 usfiles(ixabba(20))
 ans = 
    'abba/Voulez-Vous/06-Does_Your_Mother_Know.htk'
 % I don't know this track, but maybe it has more Beatles-like instrumentation?
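
To put a number on "mostly", the held-out results can be tallied directly from the values printed above. This is only a sketch: the variable names llra, llrb, heldout and ncorrect are ours, and the LLR values are copied by hand from the two listings rather than recomputed.

```matlab
% LLR values copied from the two listings above (tracks 1-20 for each artist)
llra = [1.7597 1.6639 2.5584 1.3849 3.0646 2.8987 1.1042 3.2112 1.8611 2.3773 ...
        1.9904 1.3427 1.9295 1.6319 0.28293 0.2766 2.1931 0.62696 0.55051 -1.4684];
llrb = [-3.2117 -2.3593 -3.6062 -4.2403 -3.9227 -3.3714 -4.1777 -3.6854 -5.5681 -4.4663 ...
        -1.952 -4.0578 -2.258 -3.9241 -3.7512 -1.6199 -2.99 -2.0967 -2.8276 -2.2747];

heldout = 11:20;    % tracks 1-10 were the training data

% Decision rule from the session: LLR > 0 means Abba, LLR < 0 means Beatles
ncorrect = sum(llra(heldout) > 0) + sum(llrb(heldout) < 0);
disp(['Held-out tracks classified correctly: ' num2str(ncorrect) ...
      ' of ' num2str(2*length(heldout))]);
```

For the values listed on this page, this prints "Held-out tracks classified correctly: 19 of 20". If you retrain the models yourself, the individual LLRs (and possibly the tally) may come out somewhat differently.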

Last updated: $Date: 2005/05/28 04:01:07 $
Dan Ellis <dpwe@ee.columbia.edu>