HAMR 2013 Proceedings

====== Latent Artists ======

by Ben Swanson and Elif Yamangil

==== Dataset ====

Assume a fixed vocabulary $V$, which in our experiments is a list of music-related multiword terms compiled by the Echonest. Each item $d_i$ in our dataset $D$ has, from an object-oriented programming point of view, the following fields:

  * Artist Name
  * Echonest ID
  * Echonest Genres (used for qualitative evaluation)
  * ML unigram model $x_i$, treating a sample of reviews for this artist as a bag of terms $w \in V$

$|V| = 3368$

$|D| = 23541$

==== Modeling Approach ====

Using Factor Analysis, we model each $x_i$ as

$z_i \sim \mathcal{N}(0,\mathbf{I})$

$x_i \sim \mathcal{N}(Wz_i,\Psi)$

==== Hypothesis ====

Much work that discovers similarity through low-dimensional representations, such as PCA or neural networks, treats each data item as a single point in space. By taking the Bayesian approach described above, we can not only embed data in a low-dimensional space but also quantify our uncertainty about each dimension.

==== Method ====

The above model can be used to predict similar artists based on distance in the latent space. The traditional approach would be to represent artist $d_i$ by its posterior mean $\mathbb{E}[z_i]$ and measure Euclidean distance. Our alternative computes distance as the KL divergence between full posteriors. The posterior is given as

$z_i \mid x_i \sim \mathcal{N}(\mathbb{E}[z_i],G)$

where

$G = (I + W^T\Psi^{-1}W)^{-1}$

The distance between two posteriors can be computed with KL divergence, which for multivariate Gaussians is given as

$KL(\mathcal{N}_0||\mathcal{N}_1) = \frac{1}{2}(\mathbb{E}[z_0] - \mathbb{E}[z_1])^T\Sigma^{-1}(\mathbb{E}[z_0] - \mathbb{E}[z_1])$

if the covariance matrix $\Sigma$ is the same for both Gaussians, as it is here because $G$ does not depend on $i$. This shows that if $\Sigma^{-1}$ is a multiple of the identity matrix, the ranking retrieved will be the same as that of Euclidean distance between posterior means.
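The two distances described above can be sketched in NumPy as follows. This is a minimal illustration, not the authors' code. It assumes that $\Psi$ is diagonal (the usual Factor Analysis setting), that the rows of ''X'' have been centered, and it uses the standard Factor Analysis posterior mean $\mathbb{E}[z_i] = GW^T\Psi^{-1}x_i$, which is not stated in the text. All function names and the toy data are hypothetical.

```python
import numpy as np

def fa_posterior(X, W, psi_diag):
    """Posterior means (one row per item) and the shared posterior covariance G.

    Assumes centered rows in X and a diagonal Psi given by the vector psi_diag.
    """
    k = W.shape[1]
    Wt_Pinv = W.T / psi_diag                     # W^T Psi^{-1}, shape (k, |V|)
    G = np.linalg.inv(np.eye(k) + Wt_Pinv @ W)   # G = (I + W^T Psi^{-1} W)^{-1}
    # E[z_i] = G W^T Psi^{-1} x_i; G is symmetric, so rows can be stacked like this.
    means = X @ Wt_Pinv.T @ G
    return means, G

def euclidean_sq(means, q):
    """Squared Euclidean distance from item q to every item's posterior mean."""
    diff = means - means[q]
    return np.einsum("ij,ij->i", diff, diff)

def kl_shared_cov(means, G, q):
    """KL divergence between posteriors that share the covariance G.

    With equal covariances this is 0.5 * (m_q - m_j)^T G^{-1} (m_q - m_j).
    """
    diff = means - means[q]
    G_inv = np.linalg.inv(G)
    return 0.5 * np.einsum("ij,jk,ik->i", diff, G_inv, diff)

# Toy example with random parameters (illustrative only, not the Echonest data).
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 6))
X -= X.mean(axis=0)                  # center the data
W = rng.normal(size=(6, 2))
psi_diag = rng.uniform(0.5, 2.0, size=6)

means, G = fa_posterior(X, W, psi_diag)
print(np.argsort(euclidean_sq(means, 0)))     # ranking by Euclidean distance
print(np.argsort(kl_shared_cov(means, G, 0))) # ranking by KL divergence
```

The two rankings can differ only through the off-identity structure of $G^{-1} = I + W^T\Psi^{-1}W$; when that matrix is a multiple of the identity, the two functions return proportional values.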
We can find the artists similar to an arbitrary artist by computing its distance to all other artists under one of these metrics and applying a threshold.

==== Evaluation ====

We evaluate similarity prediction on the top 300 artists by Echonest "hotttness", a set we will call $\mathcal{H}$. As ground truth for each artist we use the official similar-artist list from the Echonest database, restricted to similar artists that are also in $\mathcal{H}$. By varying the threshold on KL divergence or Euclidean distance we can trace out an ROC curve.

Our results, shown in the ROC plots below, correspond to training on the full dataset and on only the top 1000 artists by hotttness. In both experimental setups the same top 300 artists are used for evaluation; the only difference is the amount of information available during training.

== Hottt 1000 ==

{{::1000.jpg?600|}}

== Full Dataset ==

{{::full.jpg?600|}}

The results do not support our hypothesis that taking uncertainty into account would create a more robust notion of similarity. While both methods capture the information in the Echonest similar-artist lists, the area under the ROC curve is clearly greater for the simple Euclidean distance approach.

The reason that the experimental results do not match our intuition is unclear. One possibility is that KL divergence is not an appropriate metric for similarity.
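The threshold sweep used to trace the ROC curve can be sketched as follows. This is a hedged illustration, not the evaluation code behind the plots: it assumes one distance and one ground-truth label per artist pair, with at least one similar and one non-similar pair, and the function names ''roc_points'' and ''auc'' are hypothetical.

```python
def roc_points(distances, is_similar):
    """ROC points (FPR, TPR) obtained by raising a distance threshold.

    A pair is predicted "similar" when its distance is at or below the threshold.
    """
    pairs = sorted(zip(distances, is_similar))
    n_pos = sum(1 for _, label in pairs if label)
    n_neg = len(pairs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one similar and one non-similar pair")
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Admit every pair tied at this distance before emitting a point.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Toy example: two similar pairs at small distances, two non-similar pairs farther away.
toy_points = roc_points([0.1, 0.2, 0.3, 0.4], [True, True, False, False])
print(auc(toy_points))
```

Running the same sweep once with KL divergence and once with Euclidean distance as ''distances'' gives the two curves being compared.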
