The Listening Machine Project
"The Listening Machine - Sound Source Organization for Multimedia Understanding" is an NSF-funded project at LabROSA
concerned with separating and recognizing acoustic sources in complex,
real-world mixtures. This web page is a central hub for materials
resulting from this project.
Project reports
Theses
Publications
2008
2007
- D. Ellis (2007)
Beat Tracking by Dynamic Programming
J. New Music Research, Special Issue on Beat and Tempo Extraction, vol. 36 no. 1, March 2007, pp. 51-60. (10pp)
DOI: 10.1080/09298210701653344
- C. Smit and D. Ellis (2007)
Solo voice detection via optimal cancelation
Proc. IEEE Workshop on Apps. of Sig. Proc. to Acous. and Audio WASPAA-07, Mohonk NY, October 2007, pp. 207-210.
- G. Poliner and D. Ellis (2007)
Improving generalization for polyphonic piano transcription
Proc. IEEE Workshop on Apps. of Sig. Proc. to Acous. and Audio WASPAA-07, Mohonk NY, October 2007, pp. 86-89.
- S.-F. Chang, D. Ellis, W. Jiang, K. Lee, A. Yanagawa, A. Loui, J. Luo (2007)
Large-scale multimodal semantic concept detection for consumer video
Multimedia Information Retrieval workshop, ACM Multimedia, Augsburg, Germany, Sep 2007, pp. 255-264.
DOI: 10.1145/1290082.1290118
- M. Mandel and D. Ellis (2007)
A Web-Based Game for Collecting Music Metadata
Proc. Int. Conf. on Music Info. Retrieval ISMIR-07, Vienna, Austria, pp. 365-366.
- D. Ellis (2007)
Classifying Music Audio with Timbral and Chroma Features
Proc. Int. Conf. on Music Info. Retrieval ISMIR-07, Vienna, Austria, pp. 339-340.
- A. Doherty, A. Smeaton, K.-S. Lee, and D. Ellis (2007)
Multimodal Segmentation of Lifelog Data
Proc. 8th Int. Conf. on Computer-Assisted Information Retrieval RIAO 2007, Pittsburgh, May 2007. (18pp)
- G. Poliner, D. Ellis, A. Ehmann, E. Gómez, S. Streich, B. Ong (2007)
Melody Transcription from Music Audio: Approaches and Evaluation
IEEE Tr. Audio, Speech, Lang. Proc., vol. 15 no. 4, May 2007, pp. 1247-1256. (10pp)
- J. Ogle and D. Ellis (2007)
Fingerprinting to Identify Repeated Sound Events in Long-Duration Personal Audio Recordings
Proc. ICASSP-07, Hawai'i, pp. I-233-236. (4pp)
- D. Ellis and G. Poliner (2007)
Identifying Cover Songs With Chroma Features and Dynamic Programming Beat Tracking
Proc. ICASSP-07, Hawai'i, pp. IV-1429-1432. (4pp)
2006
- M. Mandel, D. Ellis, and T. Jebara (2006)
An EM algorithm for localizing multiple sound sources in reverberant environments
Advances Neural Info. Proc. Sys. 19, Vancouver CA, Dec 2006, pp. 953-960. (8pp)
- R. Weiss and D. Ellis (2006)
Estimating single-channel source separation masks: Relevance Vector Machine classifiers vs. pitch-based masking
Proc. Workshop on Statistical and Perceptual Audition SAPA-06, pp. 31-36, Pittsburgh PA, Oct 2006. (6pp)
- K. Lee and D. Ellis (2006)
Voice Activity Detection in Personal Audio Recordings Using Autocorrelogram Compensation
Interspeech ICSLP-06, pp. 1970-1973, Pittsburgh, Oct 2006. (4pp)
- G. Poliner and D. Ellis (2006)
A Discriminative Model for Polyphonic Piano Transcription
Eurasip Journal of Advances in Signal Processing, special issue on Music Signal Processing, 2007 (2007), Article ID 48317. (9pp)
DOI: 10.1155/2007/48317
- X. Halkias and D. Ellis (2006)
Estimating the Number of Marine Mammals using Recordings of Clicks from One Microphone
Proc. ICASSP-06, Toulouse, May 2006, pp. V-769-772. (4pp)
- D. Ellis and R. Weiss (2006)
Model-Based Monaural Source Separation Using a Vector-Quantized Phase-Vocoder Representation
Proc. ICASSP-06, Toulouse, May 2006, pp. V-957-960. (4pp)
- D. Ellis and G. Poliner (2006)
Classification-Based Melody Transcription
Machine Learning, special issue on Machine Learning In and For Music, vol. 65 no. 2-3, Dec 2006, pp. 439-456. (18pp)
DOI: 10.1007/s10994-006-8373-9
- D. Ellis and K. Lee (2006)
Accessing minimal-impact personal audio archives
IEEE MultiMedia, vol. 13 no. 4, Oct-Dec 2006, pp. 30-38. (9pp)
2005
- K. Dobson, B. Whitman, D. Ellis (2005)
Learning Auditory Models of Machine Voices
Proc. IEEE Workshop on Apps. of Sig. Proc. to Acous. and Audio WASPAA-05, Mohonk NY, October 2005, pp. 339-342. (4pp)
- G. Poliner, D. Ellis (2005)
A Classification Approach to Melody Transcription
Proc. Int. Conf. on Music Info. Retrieval ISMIR-05, London, September 2005, pp. 161-166. (6pp)
- M. Mandel, D. Ellis (2005)
Song-Level Features and Support Vector Machines for Music Classification
Proc. Int. Conf. on Music Info. Retrieval ISMIR-05, London, September 2005, pp. 594-599. (6pp)
- N. Lesser, D. Ellis (2005).
Clap Detection and Discrimination for Rhythm Therapy
Proc. ICASSP-05, Philadelphia, March 2005. (4pp)
- M. Reyes-Gomez, N. Jojic, and D. Ellis (2005).
Deformable Spectrograms
AI & Statistics 2005, Barbados, Jan 2005. (8pp)
2004
- D. Ellis and K.S. Lee (2004).
Minimal-Impact Audio-Based Personal Archives
First ACM workshop on Continuous Archiving and Recording of Personal Experiences CARPE-04, New York, Oct 2004. (6pp)
- M. Reyes-Gomez, N. Jojic, and D. Ellis (2004).
Towards single-channel unsupervised source separation of speech mixtures: The layered harmonics/formants separation-tracking model
Workshop on Statistical and Perceptual Audio Processing SAPA'04, Jeju, Korea, Oct 2004. (6pp)
- M.J. Reyes-Gomez, N. Jojic and D. Ellis (2004).
Detailed graphical models for source separation and missing data interpolation in audio
Snowbird Learning Workshop, Snowbird, 2004. (2pp)
- D. Ellis and K.S. Lee (2004).
Features for Segmenting and Classifying Long-Duration Recordings of Personal Audio
Workshop on Statistical and Perceptual Audio Processing SAPA'04, Jeju, Korea, Oct 2004. (6pp)
- M.J. Reyes-Gomez, D. Ellis and N. Jojic (2004).
Multiband Audio Modeling for Single Channel Acoustic Source Separation
Proc. ICASSP-04, Montreal, May 2004. (4pp)
- D. Ellis and J. Arroyo (2004).
Eigenrhythms: Drum pattern basis sets for classification and generation
International Symposium on Music Information Retrieval ISMIR-04, Barcelona, Oct 2004, pp. 101-106. (6pp)
(longer tech report version with color figures)
2003
- M.J. Reyes-Gomez, B. Raj and D. Ellis (2003).
Multi-channel Source Separation by Beamforming Trained with Factorial HMMs.
Proc. IEEE Workshop on Apps. of Sig. Proc. to Acous. and Audio, Mohonk NY, October 2003. (4pp)
- M.J. Reyes-Gomez & D.P.W. Ellis (2003).
Selection, Parameter Estimation, and Discriminative Training of Hidden Markov Models for General Audio Modeling.
Proc. ICME-03, Baltimore, July 2003. (4pp)
- M.J. Reyes-Gomez, B. Raj, and D. Ellis (2003).
Multi-channel Source Separation by Factorial HMMs.
Proc. ICASSP-03, Hong Kong, April 2003. (4pp)
Web materials
Acknowledgment
This material is based in part upon work supported by the National
Science Foundation under Grant No. IIS-0238301. Any opinions, findings
and conclusions or recommendations expressed in this material are those
of the author(s) and do not necessarily reflect the views of the
National Science Foundation (NSF).
Last updated: 2009/06/05 16:10:53
Dan Ellis <dpwe@ee.columbia.edu>