Columbia EE :
Separating Speech from Speech Noise
Separating speech in complex acoustic environments -- such as picking
out a single voice at a cocktail party -- is extremely difficult.
Many speech enhancement and separation techniques fail when the
target and the interference have the same properties, as they do
when both are speech. This project applies novel models -- using
Computational Auditory Scene Analysis (CASA) and trained models of
the speech signal -- to see how well speech can be separated from
competing speech. In particular,
our goal is to provide separations that are demonstrably of benefit
to human listeners, hence our collaboration with perceptual
experimentalists at EBIRE and Boston University.
R. Weiss and D. Ellis (2008)
Speech separation using speaker-adapted Eigenvoice speech models
Computer Speech and Language, accepted for publication. (18pp)
R. Weiss, M. Mandel, D. Ellis (2008)
Source Separation Based on Binaural Cues and Source Model Constraints
Proc. Interspeech-08, pp. 419-422, Brisbane, Australia, September 2008.
K. Hu, P. Divenyi, D. Ellis, Z. Jin, B. Shinn-Cunningham, D. Wang (2008)
Preliminary Intelligibility Tests of a Monaural Speech Segregation System
Proc. SAPA-08, pp. 11-16, Brisbane, Australia, September 2008.
A. Lammert, D. Ellis, P. Divenyi (2008)
Data-driven articulatory inversion incorporating articulator priors
Proc. SAPA-08, pp. 29-34, Brisbane, Australia, September 2008.
S. Ravuri and D. Ellis (2008)
Stylization of Pitch with Syllable-Based Linear Segments
Proc. ICASSP-08 Las Vegas, April 2008, pp. 3985-3988.
M. Mandel and D. Ellis (2007)
EM localization and separation using interaural level and phase cues
Proc. IEEE Workshop on Apps. of Sig. Proc. to Acous. and Audio WASPAA-07, Mohonk NY, October 2007, pp. 275-278.
R. Weiss and D. Ellis (2007)
Monaural speech separation using source-adapted models
Proc. IEEE Workshop on Apps. of Sig. Proc. to Acous. and Audio WASPAA-07, Mohonk NY, October 2007, pp. 114-117.
M. Athineos and D. Ellis (2007)
Autoregressive Modeling of Temporal Envelopes
IEEE Tr. Signal Processing, vol. 55 no. 11, Nov 2007, pp. 5237-5245. (9pp)
R. Weiss and D. Ellis (2006)
Estimating single-channel source separation masks: Relevance Vector Machine classifiers vs. pitch-based masking
Proc. Workshop on Statistical and Perceptual Audition SAPA-06, pp. 31-36, Pittsburgh PA, Oct 2006. (6pp)
D. Ellis and R. Weiss (2006)
Model-Based Monaural Source Separation Using a Vector-Quantized Phase-Vocoder Representation
Proc. ICASSP-06, Toulouse, May 2006, pp. V-957-960. (4pp)
M. Mandel, D. Ellis, and T. Jebara (2006)
An EM algorithm for localizing multiple sound sources in reverberant environments
Proc. Neural Info. Proc. Sys., Vancouver CA, Dec 2006. (8pp)
M. Mandel and D. Ellis (2006)
A probability model for interaural phase difference
Proc. Workshop on Statistical and Perceptual Audition SAPA-06, pp. 1-6, Pittsburgh PA, Oct 2006. (6pp)
D. Ellis (2006)
Model-Based Scene Analysis
Chapter 4 of Computational Auditory Scene Analysis: Principles, Algorithms, and Applications, D. Wang & G. Brown, eds., Wiley/IEEE Press, pp. 115-146, 2006. (46pp)
This material is based in part upon work supported by the National
Science Foundation under Grant No. IIS-05-35168. Any opinions, findings
and conclusions or recommendations expressed in this material are those
of the author(s) and do not necessarily reflect the views of the
National Science Foundation (NSF).
Last updated: 2005/08/09 03:26:12
Dan Ellis <firstname.lastname@example.org>