Authors Ajay Srinivasamurthy, Swapnil Gupta, Sankalp Gulati, Kaustuv Kanti Ganguli
Affiliation Music Technology Group, UPF, Barcelona



This hack is a sawaal-jawaab (call-response) machine improvisation system for vocals and tabla. The tabla is the primary percussion instrument used in Hindustani music and is characterized by onomatopoeic vocal syllables that represent the different strokes played on it. The syllables, called bols, are used as mnemonics in music learning (oral tradition). Reciting these syllables during a tabla solo performance is also common and is an art form in itself.

During Hindustani music concerts, it is common to have call-response improvisatory passages between musicians, called a sawaal-jawaab (literally, question-answer). It is also common in tabla solos to have a sawaal-jawaab between a musician reciting vocal syllables and a musician responding on the tabla. We explore such an improvisation in this hack, with the call being the vocal recitation of syllables. The response is an improvisation on the call, built using timing, rhythmic and timbral features from the call and exploiting the onomatopoeic nature of the tabla bols. The improvisation is done within the framework of a specific taal, the rhythmic framework of Hindustani music.


  • Select the taal (metrical cycle) from the dropdown menu. Use the tempo slider to set a tempo you are comfortable with. There are both audio and visual cues of the progression through the cycle. There are audio clicks marking the sections of the cycle, with the downbeat (called the sam) having a different click. The downbeat is shown with a red dot, with the beats blinking in green.
  • Click and hold down the record button. The system starts recording from the next downbeat and records everything you say as your call.
  • Release the record button and wait for the response. The response is also aligned with the downbeat and starts on the next cycle.
  • Repeat as many times as you want and improvise with the system!
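
The downbeat alignment described in the steps above can be illustrated with a small timing calculation. This is a minimal sketch and not the hack's actual code: it assumes that each unit of the cycle lasts one beat, that the tempo is given in beats per minute, and that the first cycle started at time 0. The function name `next_downbeat` is hypothetical.

```python
import math

def next_downbeat(now_s, tempo_bpm, cycle_units):
    """Return the time (seconds) of the first downbeat strictly after now_s.

    Assumes the first cycle started at t = 0 and every unit lasts one beat.
    """
    cycle_s = cycle_units * 60.0 / tempo_bpm   # duration of one full cycle
    cycles_done = math.floor(now_s / cycle_s)  # complete cycles elapsed so far
    return (cycles_done + 1) * cycle_s

# Teentaal (16 units) at 120 BPM: one cycle lasts 8 seconds.
print(next_downbeat(3.2, 120, 16))
```

Under these assumptions, pressing record 3.2 seconds into a teentaal cycle at 120 BPM would start the recording at the 8-second mark, and releasing the button would schedule the response for the downbeat after the release.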

The complete response has three main parts each time: a theka (basic pattern of the taal), followed by a response to the call, and a cycle-long tihai (a polyrhythmic concluding phrase). At present, only one theka and one tihai are used per taal. You can choose from four different taals (cycle lengths in parentheses): teentaal (16 units), ektaal (12 units), jhaptaal (10 units), and rupak (7 units).
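
The three-part structure can be written down as a small lookup plus a plan of parts. This is only a sketch: the cycle lengths come from the list above, but the names `TAAL_UNITS` and `response_plan` are hypothetical, and the lengths of the theka (one cycle) and of the response (as many cycles as the call) are assumptions that the description does not state.

```python
# Cycle lengths (in units) of the four available taals.
TAAL_UNITS = {"teentaal": 16, "ektaal": 12, "jhaptaal": 10, "rupak": 7}

def response_plan(taal, call_cycles, theka_cycles=1):
    """Return (part, cycles, units) triples for theka, response, and tihai.

    Assumption: the response spans as many cycles as the call and the theka
    spans theka_cycles cycles. The tihai is one cycle long.
    """
    units = TAAL_UNITS[taal]
    parts = [("theka", theka_cycles), ("response", call_cycles), ("tihai", 1)]
    return [(name, cycles, cycles * units) for name, cycles in parts]

# A two-cycle call in rupak (7 units per cycle).
for part in response_plan("rupak", 2):
    print(part)
```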


An example sawaal-jawaab in a concert between vocal melody and tabla is here:

Improvisation Algorithm

From the audio recording of the call, the algorithm extracts beat-aligned audio features such as MFCCs and energy. The features are used to perform a basic onset detection and to build a basic timbre model for the beat. Tabla stroke samples are selected from a pool based on these MFCC, onset, and energy features, and a response is generated. The response is randomized using probability thresholds to make it sound natural. The most common rolls on the tabla are also used in highly dense regions of the call.
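
The selection step might look roughly like the sketch below. It is an illustration under stated assumptions, not the hack's implementation: onsets are taken to be frames where the energy rises by more than a threshold, each stroke in the pool is represented by a single feature vector, the closest stroke is chosen by Euclidean distance, and randomization is a simple swap with probability `swap_prob`. All function names, the threshold values, and the toy feature vectors are made up for illustration, and rolls in dense regions are not covered.

```python
import math
import random

def detect_onsets(energy, threshold=0.1):
    """Indices of frames whose energy rises by more than threshold."""
    return [i for i in range(1, len(energy))
            if energy[i] - energy[i - 1] > threshold]

def nearest_stroke(feature, pool):
    """Name of the pool stroke closest (Euclidean distance) to feature."""
    return min(pool, key=lambda name: math.dist(feature, pool[name]))

def generate_response(features, energy, pool, swap_prob=0.2, rng=None):
    """Return (frame index, stroke name) pairs, one per detected onset."""
    rng = rng or random.Random()
    names = sorted(pool)
    response = []
    for i in detect_onsets(energy):
        stroke = nearest_stroke(features[i], pool)
        if rng.random() < swap_prob:  # occasionally swap in a random stroke
            stroke = rng.choice(names)
        response.append((i, stroke))
    return response

# Toy two-dimensional "timbre" vectors (not real MFCC values).
pool = {"dha": [1.0, 0.8], "na": [0.2, 0.1], "ge": [0.9, 0.1]}
energy = [0.0, 0.5, 0.4, 0.9, 0.1]
features = [[0.0, 0.0], [0.9, 0.9], [0.3, 0.3], [0.1, 0.2], [0.0, 0.0]]

print(generate_response(features, energy, pool, swap_prob=0.0))
```

With `swap_prob=0.0` the output is deterministic. Raising it trades fidelity to the call for variety in the response.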

Dataset and Resources

  • The dataset consists of about 200 samples spanning 14 different tabla strokes, with between 10 and 20 samples for each stroke type.
  • Essentia, an open-source C++ library for audio analysis and audio-based music information retrieval, is used to extract audio features.
  • Audio analysis is done in Python, and generation in JavaScript.
sawaal-jawaab.txt · Last modified: 2015/11/16 15:20 by ajaysm