Signal separation motivated by human auditory perception
Richard Stern
(Carnegie Mellon University)
The human auditory system uses a number of well-identified cues to segregate and separate individual sound sources in a complex acoustical environment. For example, researchers in auditory scene analysis have long identified cues such as common onset, correlated fluctuations in instantaneous amplitude and frequency, harmonicity, and common interaural time and amplitude differences as ways of identifying which components of a complex signal are derived from a common source. It is widely believed that using these cues to achieve such "grouping" and signal separation should improve the accuracy of automatic speech recognition in very difficult environments such as competing speech, background music, and transient noise. This has been a goal of several research groups in computational auditory scene analysis. In this talk I will describe and discuss ways in which signals can be separated using physiologically motivated cues, along with the potential benefit to be derived from such separation for automatic speech recognition.
Relevant Material:
Stern, R. M. (2002). "Using Computational Models of Binaural Hearing to Improve Automatic Speech Recognition: Promise, Progress, and Problems," AFOSR Workshop on Computational Audition. http://www.cs.cmu.edu/~rms/BinauralWeb/papers/BinauralASR.pdf
Stern, R. M. and Sullivan, T. M. (1995). "Robust Speech Recognition Based on Human Binaural Perception," Proc. of the ATR Workshop on A Biological Framework for Speech Perception and Production, Kansai Science City, September 1994. Reprinted as ATR Technical Report TR-H-121 (1995). http://www.cs.cmu.edu/~robust/Papers/ATR94.pdf
Stern, R. M. and Trahiotis, C. (1995). "Models of Binaural Interaction," Chapter in Handbook of Perception and Cognition, Volume 6: Hearing, pp. 347-386, B. C. J. Moore, Ed. New York: Academic Press. http://www.cs.cmu.edu/~rms/BinauralWeb/papers/modchap.pdf
Other Material:
http://www.cs.cmu.edu/~rms/BinauralWeb/
http://www.cs.cmu.edu/~robust/