//Page: realtime_solo-to-tutti_audio_alignment_separation-by-humming_for_realtime_karaoke_generation — created 2014/10/26 04:13 by maezawa; current revision 2014/10/26 05:21.//
====== Karaoke on Your Favorite Recording... without an SMF! ======

====== Background ======
Automatic accompaniment (e.g. a karaoke that follows your playing) kicks ass.
It further kicks ass when the karaoke track is generated from that favorite recording of yours, with the soloist separated out.

One thing that bugs me, though, is that existing methods require digital score data (e.g. a standard MIDI file, MusicXML, etc.).
Preparing an SMF is annoying, so I want an accompaniment system that does not need SMFs to work.
The SMF is used for two purposes: (1) making a karaoke track from your favorite recording (informed source separation), and (2a) tracking where you are playing in the music (score following) together with (2b) understanding which part of the karaoke track the system should be playing back (offline alignment).

So, my goal is to circumvent the use of the SMF for (1) generating the karaoke track, and (2) synchronizing your playing to the karaoke track.
====== Problem statement ======

I basically want to (1) load a favorite violin concerto, (2) play the violin concerto on my violin, then (3) the track from (1) plays in sync with me, with the violin solo part separated out.
The HMM is left-to-right, allowing the current state to (1) stay in the same state, or (2) advance to the next state.
The key here is that the overlap used for computing **X** is smaller than that used for **U**.
For example, **X** is computed at 10 frames per second, whereas **U** is computed at 50 frames per second.
This way, the left-to-right architecture permits the user to play faster than **X**, and the number of states stays manageable for a moderately long piece of music.

Aside 1: Elaborate schemes using a semi-HMM weren't worth the effort, at least for a simple duration pdf.
Aside 2: I first tried modeling the state dynamics with a particle filter, but it didn't quite work. With finitely many particles, once the filter gets "stuck," a simple proposal distribution is insufficient to recover the right position.
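The left-to-right filtering above can be sketched as a log-domain HMM forward recursion. This is only an illustrative toy, not the system's actual code: the observation scores, the stay/advance probabilities, and the frame counts below are all made-up placeholders; in the real system the scores would come from matching each input frame of **U** against each reference frame of **X**.

```python
import numpy as np

def online_alignment_step(log_alpha, log_obs, log_p_stay, log_p_advance):
    """One forward-filtering step of a left-to-right HMM.

    log_alpha: (N,) log-belief over the reference frames of **X**
               after the previous input frame of **U**.
    log_obs:   (N,) log-likelihood of the current input frame under
               each reference frame (some spectral match score).
    Each state may either stay put or advance to the next state.
    """
    stay = log_alpha + log_p_stay
    advance = np.full_like(log_alpha, -np.inf)
    advance[1:] = log_alpha[:-1] + log_p_advance  # left-to-right: only forward
    log_alpha_new = np.logaddexp(stay, advance) + log_obs
    log_alpha_new -= log_alpha_new.max()  # renormalize to avoid underflow
    return log_alpha_new

# Toy run: 8 reference frames; since **U** runs at ~5x the frame rate of
# **X**, staying is a priori more likely than advancing (0.8 vs 0.2),
# yet advancing every frame would still allow playing up to 5x faster.
N = 8
log_alpha = np.full(N, -np.inf)
log_alpha[0] = 0.0  # start at the beginning of the piece
log_p_stay, log_p_advance = np.log(0.8), np.log(0.2)
for t in range(20):
    # fake observation scores whose peak moves at the nominal tempo (t/5)
    log_obs = -0.5 * (np.arange(N) - t / 5.0) ** 2
    log_alpha = online_alignment_step(log_alpha, log_obs, log_p_stay, log_p_advance)
pos = int(np.argmax(log_alpha))
print(pos)  # estimated current reference frame of **X**
```

The renormalization by the max is what keeps a long piece (thousands of states) numerically stable in the log domain.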
In implementation, I also prepared a few "detuned" versions of **X**, to compensate for small tuning variations.
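One way to produce such detuned copies, assuming **X** lives on a log-frequency axis (e.g. a constant-Q-like representation, which is my assumption here, not something the text specifies): a detune of c cents is then just a fractional bin shift along the frequency axis. The function name, the cent values, and `bins_per_semitone` are all illustrative.

```python
import numpy as np

def detuned_copies(X, cents=(-30, -15, 15, 30), bins_per_semitone=3):
    """Make slightly detuned copies of a log-frequency template matrix.

    X is (n_bins, n_frames) with `bins_per_semitone` bins per semitone;
    a detune of c cents is a shift of c/100 * bins_per_semitone bins
    along axis 0. Returns [X, detuned copy 1, detuned copy 2, ...].
    """
    n_bins = X.shape[0]
    idx = np.arange(n_bins, dtype=float)
    copies = [X]
    for c in cents:
        shift = c / 100.0 * bins_per_semitone
        shifted = np.empty_like(X)
        for j in range(X.shape[1]):
            # linear interpolation per frame; bins shifted off the edge -> 0
            shifted[:, j] = np.interp(idx + shift, idx, X[:, j],
                                      left=0.0, right=0.0)
        copies.append(shifted)
    return copies
```

Each copy would then be matched against the input alongside the original, and the best-scoring tuning wins (or contributes to the observation likelihood).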
- | |||
- |