|Authors||Ajay Srinivasamurthy, Swapnil Gupta, Sankalp Gulati, Kaustuv Kanti Ganguli|
|Affiliation||Music Technology Group, UPF, Barcelona|
This hack is a sawaal-jawaab (call-response) machine improvisation system for vocals and tabla. The tabla is the primary percussion instrument used in Hindustani music and is characterized by onomatopoeic vocal syllables that represent the different strokes played on it. These syllables, called bols, are used as mnemonics in music learning (oral tradition). Additionally, reciting these syllables during a tabla solo performance is common and is an art form in itself.
During Hindustani music concerts, it is common to have call-response improvisatory passages between musicians, called sawaal-jawaab (literally, question-answer). It is also common in tabla solos to have a sawaal-jawaab in which a musician recites vocal syllables and the tabla player responds. We explore such an improvisation in this hack, with the call being the vocal recitation of syllables. The response is an improvisation on the call, built using timing, rhythmic and timbral features extracted from it and exploiting the onomatopoeic nature of the tabla bols. The improvisation takes place within a specific taal, the rhythmic framework of Hindustani music.
Each complete response has three main parts: a theka (the basic pattern of the taal), followed by a response to the call, and a cycle-long tihai (a polyrhythmic concluding phrase). At present, only one theka and one tihai are used per taal. You can choose from four taals (cycle lengths in parentheses): teentaal (16 units), ektaal (12 units), jhaptaal (10 units), and rupak (7 units).
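The three-part layout can be sketched as a small planning function. This is only an illustration, not the hack's actual code: the names `TAAL_CYCLE_LENGTHS`, `Segment` and `plan_response` are hypothetical, and the sketch assumes that the theka occupies exactly one cycle and that the response to the call spans a whole number of cycles (`response_cycles`), neither of which is stated in the description. Only the cycle lengths and the cycle-long tihai come from the text.

```python
from dataclasses import dataclass
from typing import List

# Cycle lengths in units, as listed in the description above.
TAAL_CYCLE_LENGTHS = {"teentaal": 16, "ektaal": 12, "jhaptaal": 10, "rupak": 7}


@dataclass
class Segment:
    name: str    # "theka", "response" or "tihai"
    start: int   # offset from the start of the full response, in units
    length: int  # duration in units


def plan_response(taal: str, response_cycles: int = 1) -> List[Segment]:
    """Lay out theka, response and tihai back to back for the given taal.

    Assumption (not from the source): the theka lasts one cycle and the
    response lasts `response_cycles` full cycles. The tihai is one cycle
    long, as stated in the description.
    """
    cycle = TAAL_CYCLE_LENGTHS[taal]
    parts = [
        ("theka", cycle),
        ("response", cycle * response_cycles),
        ("tihai", cycle),
    ]
    segments = []
    offset = 0
    for name, length in parts:
        segments.append(Segment(name, offset, length))
        offset += length
    return segments


if __name__ == "__main__":
    for seg in plan_response("jhaptaal"):
        print(seg)
```

Keeping the layout in units rather than seconds means the same plan can be rendered at any tempo once a unit duration is chosen.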
An example sawaal-jawaab in a concert between vocal melody and tabla is here: https://youtu.be/SSpHRcQCO8Q?t=1274
From the audio recording of the call, the algorithm extracts beat-aligned audio features such as MFCCs and energy. These features are used to perform basic onset detection and to build a basic timbre model for each beat. Tabla stroke samples are then selected from a pool based on the MFCC, onset and energy features, and a response is generated. The response is randomized using probability thresholds to make it sound natural. The most common rolls on the tabla are also used in highly dense regions of the call.
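A minimal sketch of the selection step, under stated assumptions, might look as follows. It is not the hack's implementation: the description does not specify the onset detector, the distance measure or how the randomization works. Here onsets are flagged by a simple energy-rise threshold, strokes are matched by Euclidean distance between feature vectors, and randomization occasionally swaps in the second-nearest stroke. The function names, the `swap_prob` parameter and all feature values are hypothetical, and the per-slot features are assumed to have been computed already.

```python
import numpy as np


def detect_onsets(energy, threshold=0.3):
    """Flag slots whose energy rises by more than `threshold` over the previous slot.

    Assumption: a plain energy-difference detector stands in for the
    unspecified "basic onset detection" in the description.
    """
    rise = np.diff(np.asarray(energy, dtype=float), prepend=0.0)
    return rise > threshold


def select_strokes(call_features, onsets, pool_features, pool_labels,
                   swap_prob=0.2, rng=None):
    """Pick one stroke label per slot; slots without an onset become rests (None)."""
    rng = np.random.default_rng() if rng is None else rng
    pool_features = np.asarray(pool_features, dtype=float)
    response = []
    for feat, has_onset in zip(np.asarray(call_features, dtype=float), onsets):
        if not has_onset:
            response.append(None)
            continue
        # Rank pool strokes by Euclidean distance to this slot's features.
        order = np.argsort(np.linalg.norm(pool_features - feat, axis=1))
        # With probability swap_prob, use the second-nearest stroke instead.
        use_second = len(order) > 1 and rng.random() < swap_prob
        response.append(pool_labels[order[1] if use_second else order[0]])
    return response


if __name__ == "__main__":
    # Toy 2-D "timbre" vectors; real features would be MFCCs plus energy.
    pool_features = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
    pool_labels = ["na", "tin", "dha"]
    energy = [0.9, 0.1, 0.8, 0.85]
    call_features = [[0.1, 0.0], [0.0, 0.0], [4.8, 5.1], [1.0, 1.0]]
    onsets = detect_onsets(energy)
    print(select_strokes(call_features, onsets, pool_features, pool_labels))
```

Setting `swap_prob=0.0` makes the selection deterministic, which is convenient when checking that the nearest-stroke matching behaves as expected before adding variation.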