Polyphonic Melody Extraction

Transcribing real music recordings into a score is a difficult problem; one simplification is to search for only a single dominant melody line within a piece of music, such as the lead vocal in a pop song. We have been pursuing a novel classification-based approach to this task; this page links to some of our results, as well as some of the ground-truth data we have prepared in this project.


We created ground-truth annotations for a range of music excerpts as part of the 2005 MIREX Audio Melody Extraction evaluation (see also the MIREX 05 Melody results). Below are two representative training sets for algorithm development.

The first column of each REF file is the time axis in seconds, and the second column is the f0 melody transcription in Hz. The corresponding .wav files are 44.1 kHz, mono, 16-bit PCM. The LabROSA training set includes audio files contributed by Emmanuel Vincent.
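
As a rough illustration of the two-column REF layout described above, here is a minimal Python sketch that parses such text into (time, f0) pairs. It assumes the columns are whitespace-separated and that a value of 0 Hz marks a frame with no melody; neither detail is stated on this page, so check both against the actual files. The names `parse_ref` and `hz_to_midi` and the sample values are hypothetical and are not part of the distributed data or code.

```python
import math
from typing import List, Tuple


def parse_ref(text: str) -> List[Tuple[float, float]]:
    """Parse REF-style text into (time_sec, f0_hz) pairs.

    Assumes whitespace-separated columns (an assumption, not
    confirmed by the page): column 1 is time in seconds and
    column 2 is the melody f0 in Hz.
    """
    frames = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        frames.append((float(fields[0]), float(fields[1])))
    return frames


def hz_to_midi(f0_hz: float) -> float:
    """Convert a frequency in Hz to a fractional MIDI note number (A4 = 440 Hz = 69)."""
    return 69.0 + 12.0 * math.log2(f0_hz / 440.0)


# Made-up sample data in the assumed layout, not taken from the real files.
sample = """0.00 0.0
0.01 220.0
0.02 440.0
"""

frames = parse_ref(sample)

# Assumption: f0 <= 0 means "no melody in this frame", so skip those frames
# before converting to MIDI note numbers.
voiced = [(t, hz_to_midi(f0)) for t, f0 in frames if f0 > 0]
```

To use this on a real file, read its contents with `open(path).read()` and pass the string to `parse_ref`.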


LabROSAmelodyextract2005.tgz (99MB) is a gzipped tar file containing our melody extraction system as submitted to MIREX 2005. It runs on a Linux (or MacOS) system and requires both Matlab and Java.


Here are some of our publications that relate to this data:

G. Poliner, D. Ellis (2005). A Classification Approach to Melody Transcription
Proc. Int. Conf. on Music Info. Retrieval ISMIR-05, London, September 2005. (6pp)


This material is based in part upon work supported by the National Science Foundation under Grant No. IIS-0238301. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation (NSF).

