Authors Sankalp Gulati, Kaustuv Kanti Ganguli, Swapnil Gupta, Ajay Srinivasamurthy
Affiliation Music Technology Group, Universitat Pompeu Fabra, Barcelona

Short summary

  • Goal: Lightweight real-time raga recognition in a web browser.
  • We start by performing real-time melody estimation and melodic transcription for Indian art music.
  • The transcribed melody is used to determine salience values for ragas using three melodic aspects that hierarchically constitute a raga melody: svaras (notes), svara transitions, and raga-characteristic melodic phrases.
  • Svara transitions and melodic phrases are searched for in a stored database in real time, and raga saliences are dynamically updated based on the matches found.
  • All the intermediate steps, such as the pitch contour, note transcription, and matched phrases, are visualized in real time.

Block diagram of the system

Screen-shot of the visualization

Extended description


We demonstrate a real-time raga recognition system capable of running in web browsers. Our system follows a hierarchical approach that uses pitch class profiles, pitch transitions, and melodic phrases for raga recognition. We process the input audio signal in real time to estimate pitch, and subsequently perform melody transcription. For each raga we store a dictionary of its svaras, svara transitions, and typical melodic phrases. The likelihood of each raga is updated in real time based on the transcribed melody. To highlight the melodic events that are characteristic of a raga, we dynamically visualize how the salience of every raga evolves.
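The transcription step can be illustrated with a minimal sketch, written here in Python for readability rather than in the JavaScript of the actual system. It assumes a known tonic, twelve equal-tempered svara positions per octave, and a frame-wise pitch track as input. The symbol set, the `tolerance_cents` and `min_frames` parameters, and the function names are all hypothetical choices for illustration, not values taken from the system.

```python
import math

# Twelve svara positions per octave, one symbol per semitone above the tonic.
# Lowercase marks the lower variant of a svara pair. This alphabet is an
# illustrative assumption, not necessarily the one used by the system.
SVARAS = ["S", "r", "R", "g", "G", "m", "M", "P", "d", "D", "n", "N"]


def hz_to_cents(freq_hz, tonic_hz):
    """Pitch distance from the tonic in cents (1200 cents per octave)."""
    return 1200.0 * math.log2(freq_hz / tonic_hz)


def transcribe(pitch_hz, tonic_hz, tolerance_cents=35.0, min_frames=5):
    """Turn a frame-wise pitch track into a list of svara symbols.

    A frame is stable on a svara when it lies within `tolerance_cents` of
    that svara's equal-tempered position. A svara is emitted once, at the
    moment a run of `min_frames` consecutive stable frames on it is reached.
    Unvoiced frames (None or <= 0) and off-grid frames break the current run.
    """
    notes = []
    current, run = None, 0
    for f in pitch_hz:
        symbol = None
        if f is not None and f > 0:
            cents = hz_to_cents(f, tonic_hz)
            nearest = round(cents / 100.0)  # nearest semitone index
            if abs(cents - 100.0 * nearest) <= tolerance_cents:
                symbol = SVARAS[nearest % 12]  # fold all octaves together
        if symbol is not None and symbol == current:
            run += 1
        else:
            current, run = symbol, (1 if symbol is not None else 0)
        if current is not None and run == min_frames:
            notes.append(current)
    return notes
```

For example, a track that holds the tonic for six frames, glides briefly, and then holds the second scale degree for five frames would be transcribed as `["S", "R"]`; the glide and any shorter note fragments are discarded. Folding all octaves onto one set of symbols is a simplification made here for brevity.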

Importance of the raga recognition task & the hierarchical model

The concept of raga in Indian art music (hereafter IAM) is quite complex, and a comprehensive definition involves multiple dimensions such as melody, rhythmicity, and timbral texture. While raga recognition is, by itself, an interesting task widely addressed by both musicologists and MIR researchers, we approach the problem from a viewpoint that additionally allows exploring the raga space through a gradual unfolding of the svaras. The melodic framework of a raga follows a hierarchical model: (i) ground level: the svaras that constitute the scale of the raga (with certain svaras held pseudo-steady for a longer duration), (ii) intermediate level: the svara transitions allowed to form meaningful note sequences, and (iii) top level: characteristic phrases that are conclusive for recognising a raga.


  • Representation: The main highlight of the proposed work is real-time audio processing that achieves note-level transcription of live input audio. We have implemented the pYIN algorithm in JavaScript to estimate the predominant melody from vocal music audio. We use a simple representation of the melodic shape that retains only the relatively stable regions of the continuous pitch contour that lie within a musically valid interval of the notes of the scale (raga).
  • Likelihood computation: For each raga we store a dictionary comprising its svaras, its svara transitions (each with a weight reflecting how characteristic it is), and its typical melodic phrases, mirroring the musical hierarchy described above. The likelihood computation is a three-stage process in which we accumulate confidence values for these three hierarchical stages. Based on lookups in the stored dictionary, we dynamically update the accumulated confidence value, which indicates the most likely raga.
  • Visualization: We visualize the output of each processing stage: the extracted pitch contour, the transcribed melody as a sequence of string symbols, the confidence values for the three hierarchical stages, and a bar chart of 20 ragas (our current dataset) in which the height of each bar is dynamically updated in proportion to the raga salience (the likelihood accumulated from the three confidence values).
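The three-stage likelihood computation described in the list above could be sketched roughly as follows, again in Python for readability. The raga names, transition weights, phrases, stage weights, and the normalisation of the final salience are all placeholders assumed for illustration; they are not taken from the system's dictionary or its 20-raga dataset.

```python
from dataclasses import dataclass, field


@dataclass
class RagaEntry:
    """Stored knowledge about one raga, one field per level of the hierarchy."""
    svaras: frozenset                                 # level 1: svaras of the scale
    transitions: dict = field(default_factory=dict)   # level 2: (from, to) -> weight
    phrases: list = field(default_factory=list)       # level 3: phrases as tuples


# Placeholder entries with made-up names, weights and phrases (assumptions
# for illustration only, not musicological claims).
DICTIONARY = {
    "raga_A": RagaEntry(
        svaras=frozenset("SRGPD"),
        transitions={("G", "P"): 1.0, ("D", "S"): 0.5},
        phrases=[("G", "R", "S")],
    ),
    "raga_B": RagaEntry(
        svaras=frozenset("SRGMPDN"),
        transitions={("N", "R"): 1.0, ("G", "M"): 0.5},
        phrases=[("N", "R", "G")],
    ),
}

# Hypothetical stage weights: higher levels of the hierarchy count for more.
STAGE_WEIGHTS = {"svara": 1.0, "transition": 2.0, "phrase": 5.0}


class SalienceTracker:
    """Accumulates per-raga confidence as svaras arrive one at a time."""

    def __init__(self, dictionary):
        self.dictionary = dictionary
        self.history = []
        self.scores = {name: {"svara": 0.0, "transition": 0.0, "phrase": 0.0}
                       for name in dictionary}

    def update(self, svara):
        """Consume one transcribed svara and update all three stages."""
        self.history.append(svara)
        for name, entry in self.dictionary.items():
            stage = self.scores[name]
            # Stage 1: the svara belongs to the raga's scale.
            if svara in entry.svaras:
                stage["svara"] += 1.0
            # Stage 2: the last two svaras form a stored, weighted transition.
            if len(self.history) >= 2:
                pair = (self.history[-2], svara)
                stage["transition"] += entry.transitions.get(pair, 0.0)
            # Stage 3: the most recent svaras end with a characteristic phrase.
            for phrase in entry.phrases:
                if tuple(self.history[-len(phrase):]) == phrase:
                    stage["phrase"] += 1.0

    def salience(self):
        """Weighted sum of the stage confidences, normalised to sum to 1."""
        raw = {name: sum(STAGE_WEIGHTS[k] * v for k, v in stage.items())
               for name, stage in self.scores.items()}
        total = sum(raw.values())
        return {name: (value / total if total else 0.0)
                for name, value in raw.items()}
```

A caller would feed each newly transcribed svara to `update()` and redraw the bar chart from `salience()` after every call. Keeping the three stage scores separate, rather than only the combined value, makes it possible to display the per-stage confidence values alongside the overall salience.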


Our current system combines real-time pitch tracking in a web browser, real-time melody transcription, and dynamic raga recognition based on a hierarchical model of melodic descriptors. Apart from being an efficient raga recognition system, it serves as a tool to explore the 'raga space' and discover insightful relationships among allied ragas that are otherwise not explicit. The system should also be useful to advanced students of IAM in pedagogical settings, where one can explore the raga space and appreciate the nuances of phrase progression as a raga unfolds.

ragawise.txt · Last modified: 2015/10/25 12:02 by kaustuvkanti