Speech Separation and Comprehension in Complex Acoustic Environments
Speech heard against a background of multiple, different speech sources, such as crowd noise or even a single talker in a reverberant environment, has been recognized as perhaps the acoustic setting most detrimental to verbal communication. Psychological and audiological data gathered over the last 25 years have succeeded in better defining the processes a human listener needs in order to perform this difficult task. The same data have also motivated the development of models that predict and explain human performance in a multi-talker setting with increasing accuracy. However, since the data indicated the limits of performance under these difficult listening conditions, it became clear that significant improvement of speech understanding in speech noise is likely to be brought about only by yet-to-be-developed devices that automatically separate speech sources, filter out the unwanted sources, and enhance the target source. The last 10-15 years have witnessed an unprecedented rush toward the development of different computational schemes aimed at achieving this goal. A cursory survey of computational separation of speech from other acoustic signals, mainly other speech, strongly suggests that the whole field is currently in flux: there are a number of initiatives, each based on an even larger number of theories, models, and assumptions. It seems that, despite commendable efforts and achievements by many researchers, it is not clear where the field is going. One possible problem is that investigators working in separate areas seldom interact.
In order to foster such interaction, we organized an interdisciplinary international workshop last year, to our knowledge the first of its kind. We invited experimental psychologists, neuroscientists, and computer scientists working on different aspects of the speech separation problem and using different techniques. The workshop, held in Montreal, Canada, over the weekend of October 31 to November 2, 2003, was sponsored by the National Science Foundation's Directorate for Computer and Information Science and Engineering, Division of Intelligent Information Systems, Program of Human Language and Communication. It was attended by twenty active presenters who constituted a representative sample of speech separation researchers from the various fields. Interspersed with presentations of the experts' work, there were periods of planned discussion that stimulated an intensive exchange of ideas and points of view. A book with contributions by all presenters, published by Kluwer Academic Publishers, is just about to appear (publication date: October 15, 2004). We must also note, however, that the field is changing rapidly. For one, the question of how the separated signals are interpreted, by humans and by machines, is increasingly being thrust to the forefront. To keep abreast of all these changes, it will be necessary for experts working on speech separation and comprehension by humans and machines to meet again at another workshop to exchange data and ideas.
Like the first, this workshop will be small: it is our desire to give each invited attendee the opportunity to present his or her views, as a presenter, a discussant, or both. The format will be that of presentations followed by both topical and general discussion periods. As a departure from last year's format, we want to open the workshop to a select group of ten graduate students and postdoctoral attendees, who will display posters that will remain accessible throughout the workshop. These young participants will also be encouraged to take an active part in the discussion periods.
In addition to the established and young presenters, representatives of funding agencies from the U.S., Europe, Canada, and Japan have also been invited as observers. The reason for inviting these representatives is to stimulate interest in the interdisciplinary problem area of speech separation and comprehension.
Objectives of the workshop
Topics to be covered
Pierre Divenyi, EBIRE, Martinez, CA (Chair)