====== Deep Unlearning ======
  
MIR techniques rely upon accurate representations of acoustic content in order to produce high-quality results.  Over the past few decades, most research has operated on hand-crafted features, which work well up to a point, but may discard important information from the representation, thereby degrading performance.
====== Implementation ======
  
Our implementation is written in Python, using the LibROSA library for low-level audio analysis and Theano for feature learning.

The model architecture is based upon the ''convolutional_mlp.py'' example from the DeepLearningTutorial, with the following modifications:
  - The input layer operates on a short fragment of audio (~0.5s), represented as a $64\times 40$-dimensional Mel power spectrum.
  - Layer 1 consists of a bank of 2-dimensional convolutional filters.  Each filter is convolved with the input layer, and the resulting filter responses are downsampled by spatial max-pooling.
  - Layer 2 consists of a linear transformation of the pooled filter responses, followed by a bank of rectified linear units.
  - Layer 3 is the output layer, implemented as a logistic regression classifier that predicts which of $k$ known artists generated the input patch.
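The three layers above can be sketched in plain NumPy. This is a minimal illustration of the forward pass only; the filter count, filter size, pooling width, hidden dimension, and $k$ below are hypothetical, and the actual model is built in Theano rather than NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shapes: a 64x40 Mel power-spectrum patch (~0.5 s of audio),
# 8 convolutional filters of size 5x5, 2x2 max-pooling, k = 10 artists.
n_mels, n_frames = 64, 40
n_filters, fh, fw = 8, 5, 5
k, n_hidden = 10, 32

def conv2d_valid(x, f):
    """Naive 'valid'-mode 2-D correlation of input x with filter f."""
    H = x.shape[0] - f.shape[0] + 1
    W = x.shape[1] - f.shape[1] + 1
    out = np.empty((H, W))
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(x[i:i + f.shape[0], j:j + f.shape[1]] * f)
    return out

def max_pool(x, p=2):
    """Spatial max-pooling over non-overlapping p x p windows."""
    H, W = x.shape[0] // p, x.shape[1] // p
    return x[:H * p, :W * p].reshape(H, p, W, p).max(axis=(1, 3))

def forward(patch, filters, W2, b2, W3, b3):
    # Layer 1: convolve each filter with the input patch, then max-pool.
    pooled = np.stack([max_pool(conv2d_valid(patch, f)) for f in filters])
    h = pooled.ravel()
    # Layer 2: linear transformation followed by rectified linear units.
    h = np.maximum(0.0, W2 @ h + b2)
    # Layer 3: logistic-regression (softmax) output over k artists.
    z = W3 @ h + b3
    e = np.exp(z - z.max())
    return e / e.sum()

# Random parameters, just to exercise the shapes end-to-end.
patch = rng.standard_normal((n_mels, n_frames))
filters = rng.standard_normal((n_filters, fh, fw)) * 0.1
pooled_dim = n_filters * ((n_mels - fh + 1) // 2) * ((n_frames - fw + 1) // 2)
W2 = rng.standard_normal((n_hidden, pooled_dim)) * 0.01
b2 = np.zeros(n_hidden)
W3 = rng.standard_normal((k, n_hidden)) * 0.01
b3 = np.zeros(k)

probs = forward(patch, filters, W2, b2, W3, b3)  # class probabilities over k artists
```

The output is a probability distribution over the $k$ candidate artists for a single input patch.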

The model is trained by stochastic gradient descent, using a learning rate of $0.05$ and mini-batches of 80 randomly selected input patches.  The objective function is the cross-entropy of the output layer against the true label, combined with $\ell_2$-regularization of the filter weights and output layer parameters.
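One SGD step of this objective can be sketched as follows, restricted to the output layer for brevity. The learning rate ($0.05$) and batch size (80) are taken from the text; the hidden dimension, regularization weight, and the random stand-in features are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

lr, batch_size, k, n_hidden = 0.05, 80, 10, 32
lam = 1e-4  # hypothetical l2-regularization weight

W = rng.standard_normal((k, n_hidden)) * 0.01
b = np.zeros(k)

# Stand-ins for the pooled/rectified features and true artist labels of a batch.
H = rng.standard_normal((batch_size, n_hidden))
y = rng.integers(0, k, size=batch_size)

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def loss(W, b):
    """Cross-entropy against the true labels plus l2 penalty on W."""
    P = softmax(H @ W.T + b)
    ce = -np.mean(np.log(P[np.arange(batch_size), y]))
    return ce + lam * np.sum(W ** 2)

# Gradient of cross-entropy + l2 for a softmax layer: (P - onehot(y)).
P = softmax(H @ W.T + b)
P[np.arange(batch_size), y] -= 1.0
grad_W = (P.T @ H) / batch_size + 2 * lam * W
grad_b = P.mean(axis=0)

before = loss(W, b)
W -= lr * grad_W   # one SGD update with learning rate 0.05
b -= lr * grad_b
after = loss(W, b)
```

In the full model the same update is applied jointly to the convolutional filters and layer-2 parameters, with Theano computing the gradients automatically.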
====== Our stuff ======
  
  * [[https://github.com/Theano/Theano|Theano]]
  * [[https://github.com/bmcfee/librosa|LibROSA]]

====== Authors ======
  * Brian McFee
  * Nicola Montecchio
deepunlearning.1372623327.txt.gz · Last modified: 2013/06/30 16:15 by bmcfee