GMM estimates of PDFs are only one of many available classification schemes. As a comparison, we now look at solving the same task with a neural network (NN), specifically a multi-layer perceptron with a single hidden layer. An NN of this kind is essentially a flexible nonlinear mapping whose parameters can be optimized to fit training data by gradient descent (the so-called back-propagation algorithm). By training it to predict the 1/0 singing label, we end up with an estimate of the probability that a particular frame corresponds to voice. Netlab again makes the training very easy for us:
% Set up the training parameters
options = zeros(1,18);
options(9) = 1; % Check the gradient calculations
options(14) = 10; % Number of training cycles
nhid = 5; % Hidden units in network - analogous to Gauss components
nout = 1; % Single output is Pr(singing)
alpha = 0.2; % Weight-decay (prior) coefficient - some experimentation needed
ndim = 2; % We're classifying on just the first two feature dimensions
net = mlp(ndim, nhid, nout, 'logistic', alpha);
% Training is via a generalized optimization routine
net = netopt(net, options, ftrs(:,1:2), labs, 'quasinew');
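If you're curious what back-propagation has actually found, the Netlab net structure exposes the learned weights directly, and the forward computation is simple enough to write out by hand. A minimal sketch (w1, b1, w2, b2 are Netlab's mlp field names; the hidden units are tanh, and the 'logistic' output is a sigmoid):
% Reproduce the net's forward pass by hand: tanh hidden layer, sigmoid output
X = ftrs(:,1:2);
z = tanh( X*net.w1 + repmat(net.b1, size(X,1), 1) ); % hidden unit activations
p = 1 ./ (1 + exp( -( z*net.w2 + repmat(net.b2, size(X,1), 1) ) )); % Pr(singing)
% p should match mlpfwd(net, X)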
Because we're still only classifying on two dimensions, we can again sample the network output over a range of values and see what we get. We can reuse the grid defined for the GMMs:
% Run the net 'forward' on the grid points
nno = mlpfwd(net, [x(:),y(:)]);
nno = reshape(nno, 100, 100);
subplot(221)
imagesc(xx,yy,nno)
axis xy
% Notice how the MLP output is built from soft (sigmoid) planar boundaries
% Compare to GMM likelihood ratio
subplot(222)
imagesc(xx,yy,log(ppS./ppM))
axis xy
% Plot the actual decision regions
subplot(223)
imagesc(xx,yy,nno>0.5);
axis xy
subplot(224)
imagesc(xx,yy,log(ppS./ppM)>0)
axis xy
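Beyond eyeballing the plots, one quick way to quantify how similar the two decision rules are is to count the fraction of grid points where they disagree (a sketch reusing nno, ppS and ppM from above):
% Fraction of the sampled feature plane where the NN and GMM decisions differ
mean(mean( (nno>0.5) ~= (log(ppS./ppM)>0) ))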
We can calculate the overall accuracy on the training data as before:
% Run the net on the training data
nnd = mlpfwd(net, ftrs(:,[1 2]));
% How well does it agree with the labels?
mean( (nnd>0.5) == labs)
ans = 0.6500
% Pretty close to simple GMMs
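Overall accuracy can hide an imbalance between the two classes, so it's worth scoring the singing and music frames separately. A quick sketch, reusing nnd and labs:
% Per-class accuracy: hit rate on singing frames, correct rejections on music
mean( nnd(labs==1) > 0.5 )
mean( nnd(labs==0) <= 0.5 )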
Finally, we can again wrap all this up in a neat parameterized function, trainnns:
% Try again with 2 dimensions and 5 hidden units, trained for 10 iterations
net = trainnns(ftrs(:,[1 2]), labs, 5, 10);
Accuracy on training data = 65.5%
Elapsed time = 87.5088 secs
% There's a random element in the training, so results will vary from run to run
Try to find neural networks that parallel the complexity (e.g. training time) of the GMMs you investigated before. How do they compare in terms of accuracy?
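As a starting point, one way to explore this is to sweep the hidden-layer size and let trainnns report the accuracy and elapsed time for each setting (a sketch; repeat each setting a few times, since the random initialization makes single runs noisy):
% Sweep the network size; trainnns prints accuracy and training time
for nhid = [2 5 10 20]
  net = trainnns(ftrs(:,[1 2]), labs, nhid, 10);
end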