====== Optimal Integration of Labels for Cal500 Dataset ======

| Code | [[https://github.com/dawenl/glad_cal500|Github Link]] |
  
[[http://cosmal.ucsd.edu/cal/projects/AnnRet/|Cal500]] is a widely used dataset for music tagging. The tags it contains include instrumentation ("Electric Guitar"), genre ("Jazz"), emotion ("Happy"), usage ("For a Party"), etc. They were collected from human annotators and integrated by "majority voting" (the tags applied by most annotators are kept). However, by accounting for the differing expertise of the annotators and the differing difficulty of the pieces, we can build a better statistical model for optimal label integration, one that ideally infers the true labels as well as the expertise of the annotators and the difficulty of the songs. This work is primarily based on [[http://mplab.ucsd.edu/~jake/OptimalLabeling.pdf|this paper]] from NIPS 2009.
===== - Model =====
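
For reference, here is a sketch of the core of the model from the referenced paper (the notation below summarizes that paper): each annotator $i$ has an expertise parameter $\alpha_i \in (-\infty, +\infty)$, each item $j$ (a song/tag pair) has an inverse difficulty $\beta_j > 0$, and the probability that annotator $i$'s label $l_{ij}$ agrees with the true label $z_j$ is modeled as

$$ p(l_{ij} = z_j \mid \alpha_i, \beta_j) = \frac{1}{1 + \exp(-\alpha_i \beta_j)}. $$

A large $\alpha_i$ corresponds to a reliable annotator (a negative $\alpha_i$ to an adversarial one), and a small $\beta_j$ to a difficult item; the true labels $z_j$, together with all $\alpha_i$ and $\beta_j$, are estimated with EM.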
  
I fit the model to instrument-based and genre-based labels, as they are simple and easy to understand (and, for now, the model I implemented only supports binary labels).
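
As a rough illustration of the fitting procedure (a sketch only, not the code in the linked repository; the function and variable names here are made up), an EM loop for this model on a complete binary label matrix could look like:

<code python>
# Illustrative EM for a GLAD-style model on binary labels (names are hypothetical).
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit  # logistic sigmoid

def fit_glad(L, n_iter=20, prior=0.5):
    """L: (n_annotators, n_songs) matrix of 0/1 labels (assumed complete)."""
    n_ann, n_songs = L.shape
    alpha = np.ones(n_ann)         # annotator expertise
    log_beta = np.zeros(n_songs)   # log inverse difficulty (keeps beta > 0)

    for _ in range(n_iter):
        # E-step: posterior probability q_j = p(z_j = 1 | L, alpha, beta)
        s = np.clip(expit(alpha[:, None] * np.exp(log_beta)[None, :]), 1e-9, 1 - 1e-9)
        log_p1 = np.log(prior) + np.where(L == 1, np.log(s), np.log(1 - s)).sum(axis=0)
        log_p0 = np.log(1 - prior) + np.where(L == 0, np.log(s), np.log(1 - s)).sum(axis=0)
        q = expit(log_p1 - log_p0)

        # M-step: maximize the expected complete-data log-likelihood over alpha, beta
        # (the paper derives analytic gradients; finite differences keep the sketch short)
        def neg_ell(params):
            a, lb = params[:n_ann], params[n_ann:]
            p = np.clip(expit(a[:, None] * np.exp(lb)[None, :]), 1e-9, 1 - 1e-9)
            ll1 = np.where(L == 1, np.log(p), np.log(1 - p))   # log-lik if true label is 1
            ll0 = np.where(L == 0, np.log(p), np.log(1 - p))   # log-lik if true label is 0
            return -(q * ll1 + (1 - q) * ll0).sum()

        res = minimize(neg_ell, np.concatenate([alpha, log_beta]), method="L-BFGS-B")
        alpha, log_beta = res.x[:n_ann], res.x[n_ann:]

    return q, alpha, np.exp(log_beta)  # inferred labels, expertise, inverse difficulty
</code>

The actual implementation used for the results below is in the GitHub repository linked at the top of the page.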
  
==== - Instruments as solo vs. background ====
  
One interesting thing to look at is how good the annotators are at labeling instruments as "Solo" (e.g. "Piano Solo", "Electric Guitar Solo"), as opposed to just labeling an instrument as background.
The histogram above shows the distributions of the average expertise $\hat{\alpha}$ for labeling instruments as background and as solo. We can see that there is no overlap, indicating that the annotators are significantly better at annotating instruments as solo than as background.
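
The comparison could be produced along the following lines (again just a sketch, reusing the ''fit_glad'' sketch above; ''label_matrices'' is a hypothetical dict from tag name to its 0/1 annotation matrix, and identifying solo tags by the substring "Solo" is an assumption about the tag names):

<code python>
# Group tags into "solo" vs. "background" and compare their average expertise (illustrative).
import numpy as np
import matplotlib.pyplot as plt

mean_alpha = {}
for tag, L in label_matrices.items():   # label_matrices: hypothetical tag -> 0/1 matrix
    _, alpha, _ = fit_glad(L)            # fit_glad from the sketch above
    mean_alpha[tag] = np.mean(alpha)     # average expertise for this tag

solo = [a for tag, a in mean_alpha.items() if "Solo" in tag]
background = [a for tag, a in mean_alpha.items() if "Solo" not in tag]

plt.hist(background, bins=10, alpha=0.6, label="instrument as background")
plt.hist(solo, bins=10, alpha=0.6, label="instrument as solo")
plt.xlabel(r"average expertise $\hat{\alpha}$")
plt.legend()
plt.show()
</code>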
  
==== - Difficulty of labeling different instruments ====
  
We can interpret the average expertise $\hat{\alpha}$ for an instrument-based tag as a reflection of how difficult it is to label the corresponding instrument correctly. Below are the 5 easiest instruments vs. the 5 hardest instruments in terms of $\hat{\alpha}$:
  
Not surprisingly, Jazz is hard.
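
As a side note, rankings like the ones above can be read straight off the same per-tag averages (continuing the hypothetical ''mean_alpha'' from the earlier sketch):

<code python>
# Sort tags by average expertise: high alpha-hat = easy to label, low = hard (illustrative).
ranked = sorted(mean_alpha.items(), key=lambda kv: kv[1], reverse=True)
print("easiest:", ranked[:5])   # 5 easiest tags
print("hardest:", ranked[-5:])  # 5 hardest tags
</code>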

===== - Future work =====

- At the moment, only binary labels are supported, but the model can easily be extended to handle multinomial labels (one possible generalization is sketched after this list).

- Currently, each individual label is treated as completely independent. In the real world, however, it is natural to consider the correlations between different tags (e.g. "Rock" is certainly more positively correlated with "Electric Guitar (Distortion)" than with "Sampler"). This could be done with an idea similar to the Correlated Topic Model ([[http://machinelearning.wustl.edu/mlpapers/paper_files/NIPS2005_774.pdf|CTM]]).

- An interesting yet challenging problem would be to integrate noisy beat annotations to create better ground-truth data for beat-tracking tasks. The main difference is that in beat annotation the labels are no longer discrete categories; they are temporally dependent series, which makes the problem much more difficult.
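
For the first point above, one natural generalization (only a sketch; it is not worked out in this project) keeps the same $\alpha_i$ and $\beta_j$ and spreads the error probability uniformly over the $K-1$ wrong categories:

$$ p(l_{ij} = k \mid z_j, \alpha_i, \beta_j) = \begin{cases} \sigma(\alpha_i \beta_j), & k = z_j \\ \frac{1 - \sigma(\alpha_i \beta_j)}{K - 1}, & k \neq z_j \end{cases} $$

where $\sigma$ is the logistic function and $K$ is the number of categories; the EM updates carry over, with the posterior over each $z_j$ becoming a $K$-dimensional multinomial.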