HAMR@ISMIR 2015 Proceedings

====== Differences ======
This shows you the differences between two versions of the page.

--- deepcomposer [2015/10/25 09:08]
bigpianist [Data Format]
+++ deepcomposer [2015/10/25 17:28] (current)
eraoul [Summary]
@@ Line 2: / Line 2: @@
 | Authors | Anna Aljanaki, Stefan Balke, Ryan Groves, Eugene Krofto, Eric Nichols |
-| Affiliation | Fake University |
 | Code | [[https://github.com/stefan-balke/hamr2015-lstm-music-gen|Github Link]] |
@@ Line 9: / Line 8: @@
   * Collect several symbolic song datasets, with melody and possibly chords
   * Represent data in a common vector format appropriate for input to a neural net
-  * Develop an LSTM architecture for generation of melody/chord output.
+  * Develop a Long Short-Term Memory (LSTM) architecture for generation of melody/chord output.
-  * **Goal:** Given a chord sequence, generate a melody.
+  * **Goal:** Given a melody and chord sequence, generate melody with chords.
   * Make music!
   * //Hopes:// Bias the network by training it with different combinations of music (e.g., ESAC + WJD = Folk songs with jazz flavour)
@@ Line 19: / Line 18: @@
     * http://theory.esm.rochester.edu/rock_corpus/
     * 200 songs
-  * Essen folk song collection: http://www.music-ir.org/mirex/wiki/2007:Symbolic_Melodic_Similarity
+  * Essen folk song collection: http://www.esac-data.org/data/
   * Wikifonia: http://www.synthzone.com/files/Wikifonia/Wikifonia.zip
   * WeimarJazzDatabase: http://jazzomat.hfm-weimar.de
@@ Line 39: / Line 38: @@
 ==== Datasets Used ====
-We decided to use three separate databases in order to validate that the results we were getting related to the data that we used in training. We chose from different styles for that reason
+We decided to use three separate databases in order to validate that the results we were getting related to the data that we used in training. We chose from different styles for that reason.
 **Rolling Stone 500**
@@ Line 45: / Line 44: @@
 **Essen Folksong Database**
-The [[http://www.esac-data.org/|Essen Folksong Database]] provides over 20,000 songs in digital format.
+The [[http://www.esac-data.org/|Essen Folksong Database]] provides over 20,000 songs in digital format. We used a dataset of 6008 songs which are in the public domain.
 **Weimar Jazz Database**
 The [[http://jazzomat.hfm-weimar.de/dbformat/dboverview.html|Weimar Jazz database]] provides a digital format of Jazz lead sheets.
 ==== Data Format ====
 **Pitch**
@@ Line 65: / Line 68: @@
 {{::metricquarter.png?800|An example of the metrical hierarchy in which the minimum beat is a 1/4 note (Lerdahl and Jackendoff, 1984, p. 19)}}
 An example of the metrical hierarchy in which the minimum beat is a 1/4 note (Lerdahl and Jackendoff, 1984, p. 19)
 Because our minimum time unit was a 1/16th note, our metrical hierarchy looked more similar to the following:
 {{::metricexample.png?800|}}
 An example of the metrical hierarchy in which the minimum beat is a 1/16 note. This hierarchy is identical to ours; however, the example shows a song in 2/4, while we assumed 4/4 for each song's time signature. Therefore, ours is equivalent to the pictured hierarchy spanning two consecutive measures. (Lerdahl and Jackendoff, 1984, p. 23)
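The metrical hierarchy on a 1/16-note grid in 4/4 can be sketched as follows. This is a minimal illustration rather than the project's actual code: the function names, and the choice of a one-hot vector per time slice, are assumptions. Each of the 16 slots in a measure gets a level from 1 (only a 1/16 position) to 5 (the downbeat).

```python
def metrical_levels(steps_per_measure=16, n_levels=5):
    """Return the metrical level (1..n_levels) of each 1/16-note slot in a 4/4 measure."""
    levels = []
    for pos in range(steps_per_measure):
        # Level k of the hierarchy has a beat every 2**k sixteenth notes
        # (k=0: 1/16, k=1: 1/8, k=2: 1/4, k=3: 1/2, k=4: whole measure).
        strength = sum(1 for k in range(n_levels) if pos % (2 ** k) == 0)
        levels.append(strength)
    return levels


def one_hot_level(level, n_levels=5):
    """Hypothetical encoding: one unit per metrical level, set to 1 for the slot's level."""
    return [1 if i == level - 1 else 0 for i in range(n_levels)]


if __name__ == "__main__":
    for pos, level in enumerate(metrical_levels()):
        print(pos, level, one_hot_level(level))
```

The downbeat is the only slot that reaches level 5, the middle of the measure reaches level 4, and the remaining quarter-note positions reach level 3.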
@@ Line 75: / Line 81: @@
 Our harmony encoding simply consisted of a separate 12-unit vector with the pitch-classes of each tone that was part of the underlying harmony set to one. Our hope was that the LSTM would intuit that the harmony related to the melody in the same time slice.
+The Essen folk song collection does not include harmony, only monophonic melodies. We added chords ourselves, using a simplistic approach: exactly one chord is assigned to each measure, so chords change only at measure boundaries. To select the chord, we build a pitch-class histogram for the measure (weighted by the duration of the notes that sound in it) and choose, among 24 masks for the major and minor triads, the one with the smallest cosine distance to the histogram.
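The per-measure chord estimation described above could look roughly like this. It is a sketch under stated assumptions, not the code from the repository: the note format `(midi_pitch, duration)`, the chord labels, and all function names are hypothetical, and ties are resolved by taking the first best template. The returned mask doubles as the 12-unit harmony vector.

```python
import numpy as np

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def triad_templates():
    """Build 24 twelve-dimensional masks: 12 major and 12 minor triads."""
    templates = {}
    for root in range(12):
        for quality, third in (("maj", 4), ("min", 3)):
            mask = np.zeros(12)
            # Root, third, and perfect fifth are set to one.
            mask[[root, (root + third) % 12, (root + 7) % 12]] = 1.0
            templates[NOTE_NAMES[root] + ":" + quality] = mask
    return templates


def pitch_class_histogram(notes):
    """Duration-weighted pitch-class histogram for the notes of one measure."""
    hist = np.zeros(12)
    for pitch, duration in notes:
        hist[pitch % 12] += duration
    return hist


def estimate_chord(notes):
    """Return (label, mask) of the triad with the smallest cosine distance."""
    hist = pitch_class_histogram(notes)
    norm = np.linalg.norm(hist)
    if norm == 0:
        return None, None  # empty measure: no chord can be estimated
    best_label, best_mask, best_dist = None, None, np.inf
    for label, mask in triad_templates().items():
        dist = 1.0 - (hist @ mask) / (norm * np.linalg.norm(mask))
        if dist < best_dist:
            best_label, best_mask, best_dist = label, mask, dist
    return best_label, best_mask


if __name__ == "__main__":
    # One measure: C4 for two beats, then E4 and G4 for one beat each.
    label, mask = estimate_chord([(60, 2.0), (64, 1.0), (67, 1.0)])
    print(label, mask)
```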
 ** Encoding **
@@ Line 86: / Line 95: @@
 {{::jazzexample.png?800|}}
-An example of one of the songs from David Temperley and Trevor Declerc's Rolling Stone 500 data set, after being formatted into a matrix of relative pitch values with their corresponding metric onset level (Note: harmony is omitted).
+An example of one of the solos from the Weimar Jazz Database (Note: harmony is omitted).
+{{:esac_harm.png?800|}}
+An example from ESAC: the song "Es flog ein klein Waldvogelein", accompanied by chords (the long stripes under the melody are the chords).
+{{::rockwithharmony.png?800|}}
+Another example from the rock corpus, the song "1999" by Prince, this time with harmony.
 ==== Neural Network ====
@@ Line 100: / Line 119: @@
   * 12 Pitch Classes (Chroma) with chord information.
   * 5 levels of the metrical hierarchy.
 ===== Libraries Used =====
@@ Line 107: / Line 127: @@
   * SQL Alchemy
   * NumPy
+===== Results =====
+==== Train on ESAC, Random Seed ====
+{{:example_rnd_01.png?800|}}
+{{:example_rnd_01.mp3|}}
+==== Train on ESAC, ESAC Seed, Probabilistic Sampling ====
+{{:example_rnd_02.png?800|}}
+{{:example_rnd_02.mp3|}}
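For readers unfamiliar with the term, probabilistic sampling means drawing the next symbol from the network's output distribution instead of always taking the most likely one. The sketch below only illustrates that difference; the function name and the assumption that the network outputs a probability vector over symbols are ours, not taken from the project code.

```python
import numpy as np


def pick_next(probs, rng=None):
    """Choose the index of the next symbol from an output distribution.

    With rng=None the most likely symbol is returned (greedy choice);
    with a NumPy Generator the symbol is drawn at random according to probs.
    """
    probs = np.asarray(probs, dtype=float)
    probs = probs / probs.sum()  # renormalise in case of rounding error
    if rng is None:
        return int(np.argmax(probs))
    return int(rng.choice(len(probs), p=probs))


if __name__ == "__main__":
    distribution = [0.1, 0.7, 0.2]  # hypothetical network output
    print("greedy: ", pick_next(distribution))
    rng = np.random.default_rng(0)
    print("sampled:", [pick_next(distribution, rng) for _ in range(10)])
```

Greedy selection tends to produce repetitive output, while sampling introduces variation at the cost of occasional unlikely notes.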
+===== Next Steps =====
+  * Try longer training with more epochs.
+  * Integrate harmony components.
+  * Cross-learn: Learn on ESAC and harmony from jazz etc.
