Examples of Speech Separation Challenges
Separating speech in complex acoustic environments, such as following
a single voice at a cocktail party, is an extremely difficult
challenge. This page collects some examples of the kinds of acoustic
scenes that might, in theory, allow for separation of speech
from extreme background interference. The emphasis is on real-world
recordings, i.e., the kinds of situations encountered in daily life.
These examples are generally near the limit of normal listeners'
ability to discern particular voices.
Sound Examples
This project is in its initial phase of defining the tasks and
approaches.
Here are some introductory illustrations of the kinds of problems we
are facing:
- Coffee Shop: overlapping conversations recorded by a body-worn mic in a coffee shop (60 sec, 16 kHz mono WAV, 1.9 MB)
- Playground: voices and other ambient sounds recorded by a body-worn mic in a playground (60 sec, 16 kHz mono WAV, 1.9 MB)
- Meeting: excerpt from a real multiparty meeting, recorded by a pair of mics on the tabletop (30 sec, 16 kHz stereo WAV, 1.9 MB)
- Street noise: a man talking on the phone passes by on the sidewalk alongside a busy Manhattan street. Binaural recording (30 sec, 44.1 kHz 128 kbps joint-stereo MP3, 476 kB)
- Cafeteria noise: recorded during a quick trip to the Business School cafeteria to pick up lunch. Binaural recording (30 sec, 44.1 kHz 160 kbps joint-stereo MP3, 592 kB)
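As a rough sanity check on the sizes listed above, the arithmetic can be sketched as follows. This assumes 16-bit PCM samples for the WAV files and constant-bitrate MP3 encoding; neither is stated on this page. The function names are hypothetical and used only for illustration.

```python
def pcm_wav_bytes(seconds, sample_rate_hz, channels, bits_per_sample=16):
    """Approximate size of uncompressed PCM audio data.

    Ignores the small WAV header. The 16-bit default is an assumption;
    the page does not state the sample width.
    """
    return seconds * sample_rate_hz * channels * bits_per_sample // 8


def cbr_mp3_bytes(seconds, kbps):
    """Approximate size of a constant-bitrate MP3 (assumed CBR, 1 kbps = 1000 bit/s)."""
    return seconds * kbps * 1000 // 8


# 60 s of 16 kHz mono and 30 s of 16 kHz stereo hold the same number of samples.
print(pcm_wav_bytes(60, 16000, 1) / 1e6)  # 1.92 (MB)
print(pcm_wav_bytes(30, 16000, 2) / 1e6)  # 1.92 (MB)

# Nominal 30 s MP3 clips at the two listed bitrates.
print(cbr_mp3_bytes(30, 128) / 1e3)  # 480.0 (kB)
print(cbr_mp3_bytes(30, 160) / 1e3)  # 600.0 (kB)
```

Under these assumptions the WAV estimates match the listed 1.9 MB. The MP3 estimates come out slightly above the listed 476 kB and 592 kB, which would be consistent with clips a little shorter than the nominal 30 seconds.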
Last updated: 2005/08/09 03:26:12
Dan Ellis <dpwe@ee.columbia.edu>