**This is an old revision of the document!** ----
====== Latent Artists ====== by Ben Swanson and Elif Yamangil Dataset - Assume a fixed vocabulary $V$, which in our experiments is a company internal list of music related multiword terms. Each item $d_i$, from an OO-programming point of view, has the following fields * Artist Name * Echonest ID * Echonest Genres * ML unigram model $x_i$, treating a sample of reviews for this artist as a bag of terms $w \in V$ Approach - Using Factor Analysis, each $x_i$ as $z_i \sim \mathcal{N}(0,\mathbf{I})$