latentartists

====== Differences ====== This shows you the differences between two versions of the page.


latentartists [2013/06/30 16:24]
ben
latentartists [2013/06/30 17:09] (current)
ben
Line 4: Line 4:
  
  
-=== Dataset - ===+==== Dataset - ====
  
  
-Assume a fixed vocabulary $V$, which in our experiments is a company internal ​list+Assume a fixed vocabulary $V$, which in our experiments is a list compiled by the Echonest
 of music related multiword terms. of music related multiword terms.
  
Line 22: Line 22:
  
  
-Modeling Approach -+==== Modeling Approach - ====
  
 Using Factor Analysis, each $x_i$ as Using Factor Analysis, each $x_i$ as
Line 30: Line 30:
 $x_i \sim \mathcal{N}(Wz,​\Psi)$ $x_i \sim \mathcal{N}(Wz,​\Psi)$
  
-Hypothesis -+==== Hypothesis - ====
  
 Much work that discovers similarity through low-dimensional representations such as PCA or Neural Networks treat Much work that discovers similarity through low-dimensional representations such as PCA or Neural Networks treat
Line 36: Line 36:
 a low dimensional space but also quantify our uncertainty about each dimension.  ​ a low dimensional space but also quantify our uncertainty about each dimension.  ​
  
-Method -+==== Method - ====
  
 The above model can be used to predict similar artists based on distance in the latent space. ​ The traditional ​ The above model can be used to predict similar artists based on distance in the latent space. ​ The traditional ​
Line 48: Line 48:
 $G = (I + W^T\Psi^{-1}W)^{-1}$ $G = (I + W^T\Psi^{-1}W)^{-1}$
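
As an illustration of how the posterior over $z$ could be computed from $G$, here is a minimal NumPy sketch. It assumes the standard factor analysis setup that the formula for $G$ implies: a prior $z \sim \mathcal{N}(0, I)$, zero-mean data, and a diagonal $\Psi$. The function name `fa_posterior` and its argument names are hypothetical and not taken from the original experiments.

```python
import numpy as np

def fa_posterior(X, W, psi_diag):
    """Posterior means and shared covariance of z under x ~ N(Wz, Psi), z ~ N(0, I).

    X:        (n, d) array, one data vector x_i per row (assumed zero-mean).
    W:        (d, k) factor loading matrix.
    psi_diag: (d,) diagonal of Psi (assumed diagonal, as in standard factor analysis).
    """
    k = W.shape[1]
    Wt_Pinv = W.T / psi_diag                      # W^T Psi^{-1}, shape (k, d)
    G = np.linalg.inv(np.eye(k) + Wt_Pinv @ W)    # G = (I + W^T Psi^{-1} W)^{-1}
    # Row i is E[z_i | x_i] = G W^T Psi^{-1} x_i (G is symmetric).
    means = X @ Wt_Pinv.T @ G
    return means, G
```

Under these assumptions $G$ does not depend on $x_i$, so every artist shares the same posterior covariance and only the posterior means differ.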
  
 +The distance between two posteriors can be computed with the KL divergence, which for multivariate Gaussians is given as
  
 +$KL(\mathcal{N}_0||\mathcal{N}_1) \propto (\mathbb{E}[z_0] - \mathbb{E}[z_1])^T\Sigma^{-1}(\mathbb{E}[z_0] - \mathbb{E}[z_1]) + C$
  
-Evaluation - +if the covariance matrix $\Sigma$ is the same for both Gaussians.  This shows that if $\Sigma^{-1}$ is a positive multiple
 +of the identity matrix, the ranking retrieved will be the same as that of Euclidean distance between posterior means.
 + 
 +We can find the artists that are similar to a given artist by computing its distance to all other artists using one of these
 +metrics and applying a threshold.
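
The two metrics and the thresholding step can be sketched as follows. This is only an illustration: it assumes the posterior means are stacked as rows of an array and that all artists share one posterior covariance `G`, and the function names `pairwise_distances` and `similar_artists` are hypothetical.

```python
import numpy as np

def pairwise_distances(means, G=None):
    """Pairwise squared distances between posterior means.

    With G=None this is the squared Euclidean distance. Otherwise it is the
    quadratic form (m_i - m_j)^T G^{-1} (m_i - m_j) from the equal-covariance KL.
    """
    diff = means[:, None, :] - means[None, :, :]           # shape (n, n, k)
    if G is None:
        return np.einsum("ijk,ijk->ij", diff, diff)
    P = np.linalg.inv(G)                                    # shared precision matrix
    return np.einsum("ijk,kl,ijl->ij", diff, P, diff)

def similar_artists(dist, i, threshold):
    """Indices of artists whose distance to artist i is at most threshold."""
    idx = np.flatnonzero(dist[i] <= threshold)
    return idx[idx != i]                                    # drop the artist itself
```

When `G` is a positive multiple of the identity, the second distance is just a rescaling of the first, so both produce the same ranking of neighbours.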
 + 
 +==== Evaluation - ====
  
 We evaluate prediction of similarity on the top 300 artists by Echonest "​hotttness",​ a set we will call $\mathcal{H}$.  ​ We evaluate prediction of similarity on the top 300 artists by Echonest "​hotttness",​ a set we will call $\mathcal{H}$.  ​
 We use the official artists similars from the Echonest database for each artist as the ground truth, provided that these We use the official artists similars from the Echonest database for each artist as the ground truth, provided that these
-similar artists are also in $\mathcal{H}$.  By varying the numeric thesh+similar artists are also in $\mathcal{H}$.  By varying the threshold on KL divergence or Euclidean distance we can trace out
 +an ROC curve. 
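
A rough sketch of how such a curve could be traced from a distance matrix is given below. It is not the evaluation code used for the plots: the function name `roc_from_distances` is hypothetical, the ground truth is assumed to be available as a boolean matrix, and tied distances are treated as separate threshold steps.

```python
import numpy as np

def roc_from_distances(dist, truth):
    """Sweep a threshold over pairwise distances and return (fpr, tpr, auc).

    dist:  (n, n) distance matrix, smaller meaning more similar.
    truth: (n, n) boolean matrix, True where artist j is a listed similar of artist i.
    """
    off_diag = ~np.eye(dist.shape[0], dtype=bool)    # ignore self-pairs
    d = dist[off_diag]
    y = np.asarray(truth)[off_diag].astype(bool)
    y_sorted = y[np.argsort(d, kind="stable")]       # loosen the threshold pair by pair
    tp = np.cumsum(y_sorted)                         # true positives accepted so far
    fp = np.cumsum(~y_sorted)                        # false positives accepted so far
    tpr = np.concatenate([[0.0], tp / max(y.sum(), 1)])
    fpr = np.concatenate([[0.0], fp / max((~y).sum(), 1)])
    # Area under the curve by the trapezoidal rule.
    auc = float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))
    return fpr, tpr, auc
```

Running this once per metric on the same set of evaluation artists would give two curves whose areas can be compared directly.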
 + 
 +Our results, shown in the ROC plots below, correspond to training on the full dataset and on only the top 1000 artists by hotttness.
 +In both experimental setups the same top 300 artists are used for evaluation; the only difference is the amount of information available
 +during training.
 + 
 +== Hottt 1000 ==
  
 +{{::​1000.jpg?​600|}}
  
 +== Full Dataset ==
  
 +{{::​full.jpg?​600|}}
  
 +The results do not support our hypothesis that taking uncertainty into account would create a more robust notion of similarity.
 +While both methods capture the information in the Echonest similar-artist lists, the area under the ROC curve is clearly
 +greater for the simple Euclidean distance based approach.
  
 +The reason that the experimental results do not match our intuition is unclear. ​ One possibility is that KL divergence
 +is not an appropriate metric for similarity.  ​
  
  
latentartists.1372623887.txt.gz · Last modified: 2013/06/30 16:24 by ben