Affiliation: International Audio Laboratories Erlangen
I used Google's DeepDream processing as an audio effect. To this end, I export the magnitude spectrogram of a music recording as the RGB channels of a PNG image and apply so-called 'gradient ascent' with pre-trained networks to that image. Afterwards, I convert the resulting image back to a magnitude spectrogram and resynthesize the audio using Griffin and Lim's method.
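The round trip between audio and image domains can be sketched as follows. This is a minimal NumPy-only sketch, not the original implementation: the STFT parameters, the 80 dB dynamic range used for the 8-bit encoding, and the function names are my own assumptions; Griffin and Lim's method is implemented in its basic form, iteratively re-estimating the phase from the magnitude alone.

```python
import numpy as np

def stft(x, n_fft=1024, hop=256):
    """Naive STFT with a Hann window (no library dependencies)."""
    win = np.hanning(n_fft)
    frames = [x[i:i + n_fft] * win
              for i in range(0, len(x) - n_fft + 1, hop)]
    return np.fft.rfft(np.array(frames), axis=1).T  # shape: (freq, time)

def istft(S, n_fft=1024, hop=256):
    """Windowed overlap-add inverse of the STFT above."""
    win = np.hanning(n_fft)
    n_frames = S.shape[1]
    x = np.zeros(n_fft + hop * (n_frames - 1))
    norm = np.zeros_like(x)
    for t in range(n_frames):
        frame = np.fft.irfft(S[:, t], n=n_fft)
        x[t * hop:t * hop + n_fft] += frame * win
        norm[t * hop:t * hop + n_fft] += win ** 2
    return x / np.maximum(norm, 1e-8)

def mag_to_uint8(mag):
    """Encode a magnitude spectrogram as one 8-bit image channel (log scale,
    assumed 80 dB dynamic range)."""
    db = np.clip(20 * np.log10(mag + 1e-6), -80, 0)
    return ((db + 80) / 80 * 255).astype(np.uint8)

def uint8_to_mag(img):
    """Decode the 8-bit image channel back to a linear magnitude spectrogram."""
    db = img.astype(float) / 255 * 80 - 80
    return 10 ** (db / 20)

def griffin_lim(mag, n_iter=50, n_fft=1024, hop=256):
    """Griffin-Lim: start from random phase and repeatedly project between
    the time domain and the given magnitude spectrogram."""
    phase = np.exp(2j * np.pi * np.random.rand(*mag.shape))
    for _ in range(n_iter):
        x = istft(mag * phase, n_fft, hop)
        S = stft(x, n_fft, hop)
        phase = S / np.maximum(np.abs(S), 1e-8)  # keep phase, discard magnitude
    return istft(mag * phase, n_fft, hop)
```

With three such channels (e.g. three different drum tracks, as in the last example below) the spectrograms can be stacked into the R, G, and B planes of a single PNG.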
Since the networks were trained on natural images, this procedure makes no sense musically. However, it produces interesting results:
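The gradient-ascent step itself can be illustrated without a deep learning framework. The sketch below is a toy stand-in, not the Caffe-based processing used here: it replaces the pre-trained CNN layer with a fixed random linear map plus ReLU (a hypothetical one-layer "network") and maximizes the squared L2 norm of its activations with DeepDream-style normalized steps, which is the same mechanic applied to the spectrogram image.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for one layer of a pre-trained network:
# a fixed random linear map followed by ReLU.
W = rng.standard_normal((64, 256)) * 0.1

def layer(x):
    return np.maximum(W @ x, 0.0)          # ReLU activations

def objective(x):
    # Gradient ascent maximizes the activation energy of the chosen layer.
    return 0.5 * np.sum(layer(x) ** 2)

def gradient(x):
    # Analytic gradient: d/dx 0.5*||relu(Wx)||^2 = W.T @ relu(Wx)
    return W.T @ layer(x)

def gradient_ascent(x, n_steps=100, step=1.0):
    for _ in range(n_steps):
        g = gradient(x)
        # DeepDream-style update: step size normalized by the mean
        # absolute gradient, so progress is scale-independent.
        x = x + step * g / (np.abs(g).mean() + 1e-8)
    return x

img = rng.standard_normal(256)             # flattened 16x16 "image"
dreamed = gradient_ascent(img)
```

In the real pipeline, `img` is the spectrogram-as-image and `layer` is a chosen layer (e.g. conv3 or pool5) of the pre-trained network, with the gradient supplied by the framework's backward pass instead of the closed form above.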
- Input signal; result using layer conv3 (MIT Places network); result using layer pool5 (MIT Places network)
- Input signal; result using layer conv3 (MIT Places network)
- Input signal (different drums encoded as RGB); result using layer conv3 (MIT Places network)
- Anaconda Python distribution
- Caffe deep learning framework
- Pre-trained networks
- IPython Notebook
- MATLAB