The cancellation principle in auditory scene analysis
Alain de Cheveigné
(Centre National de la Recherche Scientifique/Institut de Recherche et Coordination Acoustique/Musique, Paris, France)

Accounts of Auditory Scene Analysis (natural or computational) usually assume a spectro-temporal representation produced by a filter bank or short-term Fourier transform. This 2D "map" is divided into "pixels" that are assigned to different sources based on criteria such as harmonicity, common onset, or statistical models. The pixels are not divided further. In my talk I will discuss a different separation scheme in which temporal structure (periodicity or cross-channel correlation) is used to cancel individual sources. This scheme differs from the previous one in that separation is not determined by spectral analysis. The two schemes are nevertheless complementary and can be used together, for example by applying cancellation within individual bands of a filter bank. The cancellation scheme has been applied effectively to fundamental frequency estimation of one or several voices, and it can in principle be applied to source separation. In the multichannel case it is closely related to Independent Component Analysis. There is some evidence that the auditory system applies the cancellation principle to exploit harmonic structure and interaural correlation.
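
As a rough illustration of cancellation based on periodicity, the Python sketch below subtracts a delayed copy of a mixture from itself, which removes any component whose period equals the delay, and then reads off the period of what remains. It is a toy example under stated assumptions, not the procedure of the papers listed below: the signals are synthetic and stationary, both periods are whole numbers of samples (50 and 73, chosen arbitrarily), and the helper names periodic_source, cancel, power and estimate_period are hypothetical.

```python
import math

def periodic_source(period, n, harmonics=(1.0, 0.5, 0.25)):
    """Synthetic periodic signal: a sum of harmonics with an integer period
    in samples (a hypothetical test signal, not real speech)."""
    return [sum(a * math.sin(2 * math.pi * (k + 1) * t / period)
                for k, a in enumerate(harmonics))
            for t in range(n)]

def cancel(x, lag):
    """Cancellation filter y[t] = x[t] - x[t - lag].
    Any component of x with period `lag` is removed from the output."""
    return [x[t] - x[t - lag] for t in range(lag, len(x))]

def power(x):
    """Mean squared value of a signal."""
    return sum(v * v for v in x) / len(x)

def estimate_period(x, min_lag, max_lag):
    """Lag in [min_lag, max_lag] that minimises the power of the residual
    left after cancellation."""
    return min(range(min_lag, max_lag + 1),
               key=lambda lag: power(cancel(x, lag)))

n = 1000
a = periodic_source(50, n)            # source A, period 50 samples (assumed)
b = periodic_source(73, n)            # source B, period 73 samples (assumed)
mix = [u + v for u, v in zip(a, b)]

# Cancelling at A's period removes A and leaves a filtered version of B.
residual = cancel(mix, 50)
reference = cancel(b, 50)             # the same filter applied to B alone

print("period estimated from A alone:", estimate_period(a, 20, 90))
print("period estimated from residual:", estimate_period(residual, 20, 90))
print("max |residual - filtered B|:",
      max(abs(r - q) for r, q in zip(residual, reference)))
```

The residual is a filtered copy of source B, not B itself: the cancellation filter also reshapes the spectrum of whatever it leaves behind. The sketch also assumes that the period of the cancelled source is already known; estimating both periods from the mixture is a harder problem that it does not attempt.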

Relevant material:

de Cheveigné A and Baskind A (2003) F0 estimation of one or several voices. Proc. Eurospeech, 833-836. (http://www.ircam.fr/pcm/cheveign/ps/2003_eurospeech.pdf)
de Cheveigné A and Kawahara H (2002) YIN, a fundamental frequency estimator for speech and music. J. Acoust. Soc. Am. 111:1917-1930. (http://www.ircam.fr/pcm/cheveign/pss/2002_JASA_YIN.pdf)
de Cheveigné A (2001) Correlation Network model of auditory processing. Proc. Workshop on Consistent & Reliable Acoustic Cues for Sound Analysis, Aalborg (Denmark). (http://www.ircam.fr/pcm/cheveign/ps/2001_CRAC.pdf)
de Cheveigné A (2001) The auditory system as a separation machine. In Breebaart J, Houtsma AJM, Kohlrausch A, Prijs VF and Schoonhoven R (eds) Physiological and Psychophysical Bases of Auditory Function. Maastricht, The Netherlands: Shaker Publishing BV, 453-460. (http://www.ircam.fr/pcm/cheveign/pss/2000_ISH.pdf)