====== Deep Dream Effect ======
| **Affiliation** | International Audio Laboratories Erlangen |
| **eMail** | [[christian.dittmar@audiolabs-erlangen.de]] |
| **Code** | [[https://github.com/stefan-balke/hamr2015-deepdreameffect]] |
===== What did I do =====
I used Google's DeepDream processing as an audio effect. To this end, I export the magnitude spectrogram of a music recording as the RGB channels of a PNG image and apply so-called 'gradient ascent' with pre-trained networks to this image. Afterwards, I convert the resulting image back to a magnitude spectrogram and resynthesize the audio using Griffin and Lim's method.
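The audio-to-image and image-to-audio conversions around the DeepDream step can be sketched roughly as follows. This is a minimal illustration and not the code from the linked repository; the file names, STFT parameters and the number of Griffin-Lim iterations are assumptions. Here a single log-magnitude spectrogram is simply replicated into all three image channels; Example 3 below instead places separated drum tracks in the individual channels.

<code python>
# Minimal sketch of the audio <-> image round trip. File names, STFT
# parameters and the number of Griffin-Lim iterations are placeholders,
# not the settings of the original project.
import numpy as np
import librosa
import soundfile as sf
from PIL import Image

N_FFT = 1024
HOP = 256

# --- Audio -> PNG --------------------------------------------------------
y, sr = librosa.load('input.wav', sr=None, mono=True)
S = np.abs(librosa.stft(y, n_fft=N_FFT, hop_length=HOP))

# Log-compress and scale the magnitudes to 8 bit, then replicate the
# single-channel spectrogram into R, G and B.
S_log = np.log1p(S)
scale = S_log.max()
gray = (255.0 * S_log / scale).astype(np.uint8)
Image.fromarray(np.dstack([gray, gray, gray])).save('spectrogram.png')

# ... the DeepDream gradient ascent is applied to 'spectrogram.png' here ...

# --- PNG -> Audio via Griffin-Lim ----------------------------------------
# Collapse the dreamed RGB channels to one plane and undo the scaling and
# log compression to obtain a (modified) magnitude spectrogram.
dreamed = np.asarray(Image.open('dreamed.png').convert('L'), dtype=np.float64)
S_hat = np.expm1(dreamed / 255.0 * scale)

# Griffin-Lim: iterate between the fixed magnitude and the phase of the
# reconstructed signal to estimate a roughly consistent phase.
angles = np.exp(2j * np.pi * np.random.rand(*S_hat.shape))
for _ in range(100):
    x = librosa.istft(S_hat * angles, hop_length=HOP)
    angles = np.exp(1j * np.angle(librosa.stft(x, n_fft=N_FFT, hop_length=HOP)))

sf.write('output.wav', librosa.istft(S_hat * angles, hop_length=HOP), sr)
</code>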
{{ :overview.png?nolink&800 |}}
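The 'gradient ascent' itself follows Google's published DeepDream IPython notebook, which runs on the Caffe framework: the image is forward-propagated up to a chosen layer, the gradient at that layer is set to its own activations (so the update maximises the L2 norm of the activations), and the gradient is back-propagated to the input image. Below is a condensed sketch of a single ascent step, adapted from that notebook; the layer name and step size are placeholders rather than the exact settings used for the examples.

<code python>
# One DeepDream-style gradient-ascent step on a Caffe network, condensed
# from Google's public deepdream notebook; layer name and step size are
# placeholders.
import numpy as np
import caffe

def make_step(net, end='conv3', step_size=1.5):
    src = net.blobs['data']   # the (preprocessed) input image
    dst = net.blobs[end]      # activations of the layer to maximise

    net.forward(end=end)
    dst.diff[:] = dst.data    # objective: L2 norm of the layer activations
    net.backward(start=end)

    g = src.diff[0]
    # Normalised step on the input image in the direction of the gradient.
    src.data[0] += step_size / np.abs(g).mean() * g
</code>

In the original notebook this step is wrapped in loops over several octaves (image scales) and iterations, with random jitter and value clipping applied to the image in between.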
Since the networks were trained on natural images, this makes no sense musically. However, it gives interesting results:
===== Example 1: Piano =====
Input signal: {{ :shenua.wav |}}
Result using layer conv3 (MIT Places network): {{ :output_shenhua_layer3.wav |}}
Result using layer pool5 (MIT Places network): {{ :output_shenhua_layer5.wav |}}
===== Example 2: Ethno =====
Input signal: {{ :olcay.wav |}}
Result using layer conv3 (MIT Places network): {{ :output_olcay_layer3.wav |}}
===== Example 3: Breakbeat =====
Input signal (different drums encoded as the RGB channels; see the sketch after this example): {{ :amenbrotherbreaknorm_mix.wav |}}
Result using layer conv3 (MIT Places network): {{ :output_amen_layer3.wav |}}
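For this example, the breakbeat was first separated into individual drum tracks, and each track's magnitude spectrogram was written to one of the three colour channels instead of replicating a single spectrogram. The following is a hypothetical sketch of that encoding; the file names and STFT parameters are assumptions.

<code python>
# Encode three separated drum tracks as the R, G and B channels of one PNG.
# Hypothetical sketch; file names and STFT parameters are assumptions.
import numpy as np
import librosa
from PIL import Image

N_FFT, HOP = 1024, 256

channels = []
for name in ['kick.wav', 'snare.wav', 'hihat.wav']:
    y, sr = librosa.load(name, sr=None, mono=True)
    channels.append(np.log1p(np.abs(librosa.stft(y, n_fft=N_FFT, hop_length=HOP))))

# Crop to a common number of frames and use one shared scaling factor so
# that the three channels remain comparable.
frames = min(c.shape[1] for c in channels)
stack = np.dstack([c[:, :frames] for c in channels])
img = (255.0 * stack / stack.max()).astype(np.uint8)
Image.fromarray(img).save('drums_rgb.png')
</code>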
===== Libraries Used =====

  * Anaconda Python Distribution
  * Caffe Deep Learning Framework
  * Pre-trained networks (e.g., the MIT Places network)
  * IPython Notebook
  * MATLAB