Speech Separation and Comprehension in Complex Acoustic Environments
Combating the Reverberation Problem

Chair: Barbara Shinn-Cunningham
Presenters: DeLiang Wang, Jay Desloge, Martin Cooke

(NB: Talk titles link to presentation slides.)

Outline
How humans cope in natural settings

In natural settings, reverberation changes many aspects of the sound reaching our ears: it smears out temporal structure, alters short-term spectral content, and distorts spatial cues in the signals, among other effects. Despite the sometimes drastic distortions caused by reverberant energy, we can usually extract the "true" information in the signals we hear, including the content of any speech messages, the locations of the sound sources, and information about the environment itself. After a brief discussion of how reverberation influences the signals we hear, I will present snippets of data suggesting that, in order to overcome distortion from reverberant energy, we calibrate to our environment, changing how we process and interpret sound based on the distortion we expect to hear. The ensuing discussion should address how our brains cope with natural settings, as well as how the neural approach is similar to and differs from the approaches used in most machine algorithms.
Papers:
How speech is corrupted by reverberation

Even in environments containing a single acoustic source, the presence of reverberant energy has a detrimental effect on the speech signal. Reverberation reduces the salience of the many spectro-temporal features that help to define important phonetic distinctions, and it undermines the effectiveness of binaural information. However, things get much worse when several sound sources are present. Not only does the additional non-direct energy constitute an additional masker for each source, but reverberation also blurs many of the cues (e.g., f0 dynamics) that are held to aid source segregation. In this talk, I will use examples of speech in moderate and severe reverberation to demonstrate, both pictorially and quantitatively, the corruption of speech features and of cues to perceptual organisation in single- and multi-source environments. This is joint work with Barbara Shinn-Cunningham and Madhusudana Shashanka.

Effects of reverberation on pitch, onset/offset, and binaural cues

To combat the reverberation problem computationally, one must understand the effects of reverberation on commonly used auditory features. I will present preliminary observations on reverberation-induced distortions in estimates of pitch, onsets and offsets, and interaural time/intensity differences for natural speech, and I will discuss their implications for the segregation of reverberant speech.

Papers:
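The onset/offset smearing mentioned in these abstracts can be illustrated numerically. The sketch below is a toy example, not drawn from any of the talks: it convolves a tone burst with an assumed synthetic room impulse response (a direct path plus an exponentially decaying noise tail; real rooms also have discrete early reflections) and measures the energy that lingers after the source has stopped, which is near zero in the dry signal but substantial in the reverberant one.

```python
import numpy as np

fs = 8000
t = np.arange(int(0.5 * fs)) / fs

# Dry signal: a 200 ms tone burst with an abrupt onset and offset.
dry = np.where((t >= 0.1) & (t < 0.3), np.sin(2 * np.pi * 440 * t), 0.0)

# Toy room impulse response (an assumption): direct path plus an
# exponentially decaying noise tail, roughly a 300 ms reverberation time.
rng = np.random.default_rng(0)
tail = int(0.3 * fs)
ir = np.zeros(tail)
ir[0] = 1.0  # direct path
ir[1:] = 0.05 * rng.standard_normal(tail - 1) * np.exp(-6.9 * np.arange(1, tail) / tail)
wet = np.convolve(dry, ir)[: len(dry)]

def rms(x):
    return float(np.sqrt(np.mean(x ** 2)))

# Energy in the 100 ms window just after the offset: the reverberant
# tail "fills in" the silence, blurring the offset cue.
post = slice(int(0.31 * fs), int(0.41 * fs))
print("post-offset RMS, dry:", rms(dry[post]))
print("post-offset RMS, wet:", rms(wet[post]))
```

The same mechanism flattens onsets and, when f0 is changing, smears the periodicity cues that pitch trackers rely on, since each new glottal pulse is overlaid with decaying copies of the preceding ones.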
Multi-microphone Source Separation in Reverberant Environments

Multi-microphone source separation systems form the final system outputs by using linear filters to process and combine microphone signals. Although the specific filters can be generated in a variety of ways, most techniques (including adaptive spatial filtering and ICA) separate a particular source from the environment using filters designed to preserve the desired source with unit gain while attenuating the remaining sources. Recent research involving directional hearing-aid performance indicates that spatial filtering techniques using these structures provide quite small benefits in realistically reverberant environments. Substantial source separation (even given exact knowledge of source-to-microphone propagation for all sources) requires filters that are often several times longer than the reverberation time of the environment. Such long filters are not only computationally demanding but they are also impractical in time-varying environments. In this session, I will discuss performance-limiting factors of these multi-microphone architectures in reverberant environments and will suggest means of improving their operation. Papers:
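The "unit gain on the target, attenuate the rest" structure described above can be sketched with a minimal delay-and-sum example. All values here are illustrative assumptions (two microphones, pure tones, anechoic propagation with a single-sample-accurate delay); the point of the abstract is precisely that once reverberation replaces the single propagation delay with a long impulse response, this short-filter cancellation no longer works.

```python
import numpy as np

fs = 8000
t = np.arange(fs) / fs  # one second of signal

# Target source arrives at both microphones simultaneously (broadside).
target = np.sin(2 * np.pi * 500 * t)

# Interferer arrives at mic 2 four samples later than at mic 1; at 1 kHz
# and fs = 8 kHz that is exactly half a period (values chosen for clarity).
d = 4
interferer_m1 = 0.8 * np.sin(2 * np.pi * 1000 * t)
interferer_m2 = 0.8 * np.sin(2 * np.pi * 1000 * (t - d / fs))

mic1 = target + interferer_m1
mic2 = target + interferer_m2

# Delay-and-sum steered at the target: average the already-aligned mics.
# The target passes with unit gain; the half-period-delayed interferer
# cancels. In a reverberant room, the interferer reaches each mic through
# a long impulse response rather than one delay, so a two-tap combiner
# like this leaves most of the interference intact.
out = 0.5 * (mic1 + mic2)

def rms(x):
    return float(np.sqrt(np.mean(x ** 2)))

print("target RMS:", round(rms(target), 3))
print("output RMS:", round(rms(out), 3))
print("residual interferer RMS:", round(rms(out - target), 6))
```

Achieving comparable cancellation in a real room would require the combiner's filters to approximate (and invert the effect of) impulse responses lasting hundreds of milliseconds, which is the filter-length problem the abstract highlights.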