Journal papers
  - 
      J. Salamon, E. Gomez, D. Ellis, G. Richard (2014)
 
    Melody Extraction from Polyphonic Music Signals 
    IEEE Signal Processing Magazine, pp. 118-134, March 2014. 
    DOI: 10.1109/MSP.2013.2271648
    
 
  
    - 
    J. Devaney, M. Mandel, D. Ellis, I. Fujinaga (2011)
 
    Automatically extracting performance data from recordings of trained singers 
    Psychomusicology: Music, Mind & Brain 21(1-2), pp. 108-136, 2011.
    
    
 
  
    - 
    G. Grindlay and D. Ellis (2011)
 
    Transcribing Multi-instrument Polyphonic Music with Hierarchical Eigeninstruments 
    IEEE J. Sel. Topics in Sig. Process., vol. 5 no. 6, pp. 1159-1169, October 2011. 
    DOI: 10.1109/JSTSP.2011.2162395.
    
 
  
    - 
    M. Mueller, D. Ellis, A. Klapuri, and G. Richard (2011)
 
    Signal Processing for Music Analysis 
    IEEE J. Sel. Topics in Sig. Process., vol. 5 no. 6, pp. 1088-1110, October 2011. 
    DOI: 10.1109/JSTSP.2011.2112333.
    
 
  
    - 
    M. Mueller, D. Ellis, A. Klapuri, G. Richard, and S. Sagayama (2011)
 
    Introduction to the Special Issue on Music Signal Processing 
    IEEE J. Sel. Topics in Sig. Process., vol. 5 no. 6, pp. 1085-1087, October 2011. 
    DOI: 10.1109/JSTSP.2011.2165109.
    
 
  
    - 
    R. Weiss, M. Mandel, and D. Ellis (2011)
 
    Combining localization cues and source model constraints for binaural source separation 
    Speech Communication, vol. 53 no. 5, pp. 606-621, May 2011. 
    DOI: 10.1016/j.specom.2011.01.003
    
 
  
    - 
    M. Mandel, S. Bressler, B. Shinn-Cunningham, and D. Ellis (2010)
 
    Evaluating Source Separation Algorithms With Reverberant Speech 
    IEEE Tr. Audio, Speech, and Lang. Proc., vol. 18 no. 7, pp. 1872-1883, September 2010. 
    DOI: 10.1109/TASL.2010.2052252
    
 
  
    - 
    K. Lee and D. Ellis (2010)
 
    Audio-Based Semantic Concept Classification for Consumer Video 
    IEEE Tr. Audio, Speech and Lang. Proc. vol. 18 no. 6 pp. 1406-1416, Aug. 2010. 
    DOI: 10.1109/TASL.2009.2034776
    
 
  
    - 
    M. Mandel, R. Weiss, and D. Ellis (2010)
 
    Model-Based Expectation-Maximization Source Separation and Localization 
    IEEE Tr. Audio, Speech, and Lang. Proc., vol. 18 no. 2, pp. 382-394, February 2010. 
    DOI: 10.1109/TASL.2009.2029711
    
 
  
    - 
    R. Weiss and D. Ellis (2010)
 
      Speech separation using speaker-adapted eigenvoice speech models 
      Computer Speech and Language, vol. 24 no. 1 pp. 16-29, Jan 2010. 
      DOI: 10.1016/j.csl.2008.03.003
    
 
  
    - 
    J. H. Jensen, M. G. Christensen, D. P. W. Ellis, and S. H. Jensen (2009)
 
    Quantitative analysis of a common audio similarity measure 
    IEEE Tr. Audio, Speech, Lang. Proc., vol. 17 no. 4 pp. 693-703, May 2009.
 
  
    - 
    M. Mandel and D. Ellis (2008)
 
    A Web-based Game for Collecting Music Metadata 
    J. New Music Research, vol. 37 no. 2, pp. 151-165, 2008.
 
  
    - 
    J. Devaney and D. Ellis (2008)
 
    An Empirical Approach to Studying Intonation Tendencies in Polyphonic Vocal Performances  
    J. Interdisc. Music Studies, vol. 2 no. 1-2, Spring/Fall 2008, pp. 141-156. (16pp)
 
  
    - 
    M. Slaney, D. Ellis, M. Sandler, M. Goto, M. Goodwin (2008)
 
      Introduction to the Special Issue on Music Information Retrieval 
      IEEE Tr. Audio, Speech, Lang. Proc. vol. 16 no. 2, Feb 2008, pp. 253-254. (2pp)
 
  
    - 
    M. Athineos and D. Ellis (2007)
 
      Autoregressive Modeling of Temporal Envelopes 
      IEEE Tr. Signal Processing, vol. 55 no. 11, Nov 2007, pp. 5237-5245. (9pp)
 
  
    - 
    G. Poliner, D. Ellis, A. Ehmann, E. Gómez, S. Streich, B. Ong (2007)
 
    Melody Transcription from Music Audio: Approaches and Evaluation 
    IEEE Tr. Audio, Speech, Lang. Proc., vol. 15 no. 4, May 2007, pp. 1247-1256. (10pp)
 
  
    - 
    D. Ellis (2007)
 
    Beat Tracking by Dynamic Programming 
    J. New Music Research, Special Issue on Beat and Tempo Extraction, vol. 36 no. 1, March 2007, pp. 51-60. (10pp) 
    DOI: 10.1080/09298210701653344
 
  
    - 
    P. Scanlon, D. Ellis, R. Reilly (2007)
 
      Using Broad Phonetic Group Experts for Improved Speech Recognition 
      IEEE Tr. Audio, Speech, Lang. Proc., vol. 15 no. 3, March 2007, pp. 803-812. (10pp)
 
  
    - 
    D. Ellis and K. Lee (2006)
 
      
      Accessing minimal-impact personal audio archives 
    IEEE MultiMedia, vol. 13 no. 4, Oct-Dec 2006, pp. 30-38. (9pp)
 
  
    - 
    G. Poliner and D. Ellis (2006)
 
      A Discriminative Model for Polyphonic Piano Transcription 
    EURASIP Journal on Advances in Signal Processing, special issue on Music Signal Processing, 2007 (2007), Article ID 48317. (9pp) 
    DOI: 10.1155/2007/48317
 
  
    - 
    D. Ellis (2006)
 
      Extracting Information from Music Audio 
    Communications of the ACM, invited paper, special issue on Music Information Retrieval, vol. 49 no. 8, August 2006, pp. 32-37. (6pp)
 
  
    - 
    D. Ellis and G. Poliner (2006)
 
      Classification-Based Melody Transcription 
    Machine Learning, special issue on Machine Learning In and For Music, vol. 65 no. 2-3, Dec 2006, pp. 439-456. (18pp) 
    DOI: 10.1007/s10994-006-8373-9
 
  
    - 
    M. Mandel, G. Poliner, D. Ellis (2006)
 
      Support Vector Machine Active Learning for Music Retrieval 
    Multimedia Systems, special issue on Machine Learning Approaches to Multimedia Information Retrieval, vol. 12 no. 1, Aug 2006, pp. 3-13. (10pp) 
    DOI: 10.1007/s00530-006-0032-2
 
  
    - 
    D. Ellis, B. Raj, J. Brown, M. Slaney, P. Smaragdis (2006)
 
      Editorial - Special Section on Statistical and Perceptual Audio Processing 
    IEEE Tr. Audio, Speech and Lang. Proc., vol. 14 no. 1, Jan. 2006, pp. 2-4. (3pp)
 
  
    - 
    X. Halkias and D. Ellis (2006)
 
      Call detection and extraction using Bayesian inference 
    Applied Acoustics, special issue on Marine Mammal Detection, vol. 67 no. 11-12, Nov-Dec. 2006, pp. 1164-1174. (11pp)
 
  
   - 
   N. Morgan, Q. Zhu, A. Stolcke, K. Sonmez, S. Sivadas, T. Shinozaki, M. Ostendorf, P. Jain, H. Hermansky, D. Ellis, G. Doddington, B. Chen, O. Cetin, H. Bourlard, and M. Athineos (2005)
 
   Pushing the Envelope -- Aside 
   IEEE Signal Processing Magazine 22(5), Sep. 2005, pp. 81-88. (8pp)
 
  
    - 
    J. Barker, M. Cooke, D. Ellis (2005)
 
    Decoding speech in the presence of other sources 
    Speech Communication, 45(1), Jan. 2005, pp. 5-25. (26pp)
 
  
    - 
    M. Cooke and D. Ellis (2004)
 
    Introduction to the special issue on the recognition and organization of real-world sound 
    Speech Communication, 43(4), Sep. 2004, pp. 273-274. (2pp) DOI: 10.1016/j.specom.2004.05.001
 
  
    - 
    A. Berenzweig, B. Logan, D. Ellis, B. Whitman (2004)
 
    A large-scale evaluation of acoustic and subjective music-similarity measures 
    Computer Music Journal, 28(2), June 2004, pp. 63-76. (14pp)
 
  
    - 
    A.J. Robinson, G.D. Cook, D. Ellis, E. Fosler-Lussier, S.J. Renals, D.A.G. Williams (2002)
 
    Connectionist speech recognition of Broadcast News 
    Speech Communication, vol. 37 no. 1-2, May 2002, pp. 27-45. (19pp)
 
  
    - 
    M. Cooke and D. Ellis (2001)
 
    
    The auditory organization of speech and other sources in listeners and computational models
     
    Speech Communication, vol. 35 no. 3-4, Oct. 2001, pp. 141-177. (37pp)
 
  
   - 
   D. Ellis (1998)
 
   
   Using knowledge to organize sound: The prediction-driven approach to computational auditory scene analysis, and its application to speech/nonspeech mixtures 
   Speech Communication special issue on Computational Auditory Scene Analysis, M. Cooke & H. Okuno, eds., vol. 27 no. 3-4, April 1999, pp. 281-298. (11pp)
 
  
 
International Conferences (refereed)
   - 
       Z. Chen, B. McFee, D. Ellis (2014)
 
     Speech enhancement by low-rank and convolutive dictionary spectrogram decomposition 
     Proc. Interspeech, (to appear), Singapore, Sep 2014.
 
  
   - 
       D. Ellis and H. Satoh and Z. Chen (2014)
 
     Detecting proximity from personal audio recordings 
     Proc. Interspeech, (to appear), Singapore, Sep 2014.
 
  
   - 
       Colin Raffel and Brian McFee and Eric J. Humphrey and Justin Salamon and Oriol Nieto and Dawen Liang and Daniel P. W. Ellis (2014)
 
     mir_eval: A Transparent Implementation of Common MIR Metrics 
     Proc. ISMIR, (to appear), Taipei, Taiwan, Oct 2014.
 
  
   - 
       B. McFee, D. Ellis (2014)
 
     Analyzing Song Structure With Spectral Clustering 
     Proc. ISMIR, (to appear), Taipei, Taiwan, Oct 2014.
 
  
   - 
       D. Liang, J. Paisley, D. Ellis (2014)
 
     Codebook-based Scalable Music Tagging With Poisson Matrix Factorization 
     Proc. ISMIR, (to appear), Taipei, Taiwan, Oct 2014.
 
  
   - 
      H. Papadopoulos, D. Ellis (2014)
 
      Music-content-adaptive robust principal component analysis for a semantically consistent separation of foreground and background in music audio signals 
   Proc. DAFx, (to appear), Erlangen, Sep 2014.
 
  
  - 
      Z. Chen, H. Papadopoulos, D. Ellis (2014)
 
    Content-adaptive speech enhancement by a sparsely-activated dictionary plus low rank decomposition 
    Proc. HSCMA, Nancy, May 2014.
 
  
  - 
      D. Liang, D. Ellis, M. Hoffman, G. Mysore (2014)
 
    Speech Decoloration Based On The Product-Of-Filters Model 
    Proc. ICASSP, 2400-2404, Florence, May 2014. 
    DOI: 10.1109/ICASSP.2014.6854030
    
 
  
  - 
      B. McFee, D. Ellis (2014)
 
    Better Beat Tracking Through Robust Onset Aggregation 
    Proc. ICASSP, 2154-2158, Florence, May 2014. 
    DOI: 10.1109/ICASSP.2014.6853980
    
 
  
  - 
      B. McFee, D. Ellis (2014)
 
    Learning To Segment Songs With Ordinal Linear Discriminant Analysis 
    Proc. ICASSP, 5197-5201, Florence, May 2014. 
    DOI: 10.1109/ICASSP.2014.6854594
    
 
  
  - 
      M. McVicar, D. Ellis, M. Goto (2014)
 
    Leveraging Repeated Utterances for Improved Transcription of Chorus Lyrics from Sung Audio 
    Proc. ICASSP, 3117-3121, Florence, May 2014. 
    DOI: 10.1109/ICASSP.2014.6854174
    
 
  
  - 
      C. Raffel, D. Ellis (2014)
 
    Estimating Timing and Channel Distortion Across Related Signals 
    Proc. ICASSP, 654-658, Florence, May 2014. 
    DOI: 10.1109/ICASSP.2014.6853677
    
 
  
  - 
      D. Silva, V. de Souza, G. Batista, E. Keogh, D. Ellis (2013)
 
    Applying Machine Learning and Audio Analysis Techniques to Insect Recognition in Intelligent Traps 
    Proc. ICMLA, (to appear), Miami, December 2013.
 
  
  - 
      D. Liang, M. Hoffman, D. Ellis (2013)
 
    Beta Process Sparse Nonnegative Matrix Factorization For Music 
    Proc. ISMIR, (to appear), Curitiba, November 2013.
 
  
  - 
      D. Silva, H. Papadopoulos, G. Batista, D. Ellis (2013)
 
    A Video Compression-Based Approach To Measure Music Structure Similarity 
    Proc. ISMIR, (to appear), Curitiba, November 2013.
 
  
  - 
      Z. Chen and D. Ellis (2013)
 
    Speech Enhancement By Sparse, Low-Rank, And Dictionary Spectrogram Decomposition 
    Proc. IEEE WASPAA, (to appear), Mohonk, October 2013.
 
  
  - 
      D. Gillespie and D. Ellis (2013)
 
    Modeling nonlinear circuits with linearized dynamical models via kernel regression 
    Proc. IEEE WASPAA, (to appear), Mohonk, October 2013.
 
  
  - 
      M. Graciarena, A. Alwan, D. Ellis, H.Franco, L. Ferrer, J. Hansen, A. Janin, B.-S. Lee, Y. Lei, V. Mitra, N. Morgan, S. O. Sadjadi, T.J. Tsai, N. Scheffer, L. N. Tan, B. Williams (2013)
 
    All for One: Feature Combination for Highly Channel-Degraded Speech Activity Detection 
    Proc. Interspeech, Lyon, August 2013, paper 1338.
 
  
    - 
    C. Cotton and D. Ellis (2013)
 
    Subband Autocorrelation Features for Video Soundtrack Classification 
    Proc. ICASSP-13, Vancouver, May 2013, 8663-8666.
 
  
    - 
    K. Su, M. Naaman, A. Gurjar, M. Patel, and D. Ellis (2012)
 
    Making a Scene: Alignment of Complete Sets of Clips based on Pairwise Audio Match 
    Proc. ICMR-12, Hong Kong, June 2012, 26-33.
 
  
    - 
    B.-S. Lee and D. Ellis (2012)
 
    Noise Robust Pitch Tracking by Subband Autocorrelation Classification 
    Proc. Interspeech-12, Portland, September 2012, paper P3b.05.
 
  
    - 
    J. McDermott, D. Ellis, H. Kawahara (2012)
 
    Inharmonic Speech: A Tool for the Study of Speech Perception and Separation 
    Proc. SAPA-SCALE 2012, Portland, September 2012, 114-117.
 
  
    - 
    T. Bertin-Mahieux and D. Ellis (2012)
 
    Large-Scale Cover Song Recognition Using the 2D Fourier Transform Magnitude 
    Proc. ISMIR-12, Porto, October 2012, 241-246.
 
  
    - 
    B. McFee, T. Bertin-Mahieux, D. Ellis, and G. Lanckriet (2012)
 
    The Million Song Dataset Challenge 
    Proc. WWW-2012 AdMIRe Workshop, Lyon, April 2012, 909-916.
 
  
    - 
    T. Bertin-Mahieux, D. Ellis, B. Whitman, and P. Lamere (2011)
 
    The Million Song Dataset 
    Proc. ISMIR pp. 591-596, Miami, October 2011.
 
  
    - 
    D. Ellis, B. Whitman, and A. Porter (2011)
 
    Echoprint - An Open Music Identification Service 
    Proc. ISMIR, late-breaking session, Miami, October 2011.
 
  
    - 
    T. Bertin-Mahieux and D. Ellis (2011)
 
    Large-Scale Cover Song Recognition Using Hashed Chroma Landmarks 
    Proc. IEEE WASPAA, pp. 117-120, Mohonk, October 2011.
 
  
    - 
    C. Cotton and D. Ellis (2011)
 
    Spectral vs. Spectro-Temporal Features for Acoustic Event Detection 
    Proc. IEEE WASPAA, pp. 69-72, Mohonk, October 2011.
 
  
    - 
    T. Bertin-Mahieux, G. Grindlay, R. Weiss, and D. Ellis (2011)
 
    Evaluating music sequence models through missing data 
    Proc. IEEE ICASSP, pp. 177-180, Prague, May 2011.
 
  
    - 
    C. Cotton, D. Ellis, and A. Loui (2011)
 
    Soundtrack classification by transient events 
    Proc. IEEE ICASSP, pp. 473-476, Prague, May 2011.
 
  
    - 
    D. Ellis, X. Zheng, and J. McDermott (2011)
 
    Classifying soundtracks with audio texture features 
    Proc. IEEE ICASSP, pp. 5880-5883, Prague, May 2011.
 
  
    - 
    C. Vezyrtzis, A. Klein, D. Ellis, Y. Tsividis (2011)
 
    Direct Processing of MPEG Audio Using Companding and BFP Techniques 
    Proc. IEEE ICASSP, pp. 361-364, Prague, May 2011.
 
  
    - 
    Y.-G. Jiang, G. Ye, S.-F. Chang, D. Ellis, and A. C. Loui (2011)
 
    Consumer Video Understanding: A Benchmark Database and An Evaluation of Human and Machine Performance 
    Proc. ACM ICMR, article #29, Trento, Apr 2011.
 
  
    - 
    G. Grindlay and D. Ellis (2010)
 
    A Probabilistic Subspace Model for Multi-Instrument Polyphonic Transcription 
    Proc. ISMIR, pp. 21-26, Utrecht, August 2010.
 
  
    - 
    T. Bertin-Mahieux, R. Weiss, and D. Ellis (2010)
 
    Clustering beat-chroma patterns in a large music database 
    Proc. ISMIR, pp. 111-116, Utrecht, August 2010.
 
  
    - 
    D. Ellis, B. Whitman, T. Jehan, and P. Lamere (2010)
 
    The Echo Nest Musical Fingerprint 
    ISMIR Late Breaking Abstracts, Utrecht, August 2010.
 
  
    - 
    D. Ellis and A. Weller (2010)
 
    The 2010 LabROSA chord recognition system 
    MIREX 2010 system abstracts, August 2010.
 
  
    - 
    S. Ravuri and D. Ellis (2010)
 
    Cover Song Detection: From High Scores to General Classification 
    Proc. IEEE ICASSP, pp. 65-68, Dallas, March 2010.
 
  
    - 
    C. Cotton and D. Ellis (2010)
 
    Audio Fingerprinting to Identify Multiple Videos of an Event 
    Proc. IEEE ICASSP, pp. 2386-2389, Dallas, March 2010.
 
  
    - 
    K. Lee, D. Ellis, and A. Loui (2010)
 
    Detecting Local Semantic Concepts in Environmental Sounds using Markov Model based Clustering 
    Proc. IEEE ICASSP, pp. 2278-2281, Dallas, March 2010.
 
   
    - 
    A. Weller, D. Ellis, and T. Jebara (2009)
 
    Structured Prediction Models for Chord Transcription of Music Audio 
    Proc. ICMLA, Miami Beach FL, December 2009, pp. 590-595.
 
  
    - 
    C. Cotton and D. Ellis (2009)
 
    Finding Similar Acoustic Events using Matching Pursuit and Locality-Sensitive Hashing 
    Proc. WASPAA-09, Mohonk NY, October 2009, pp. 125-128.
 
  
    - 
    C. Smit and D. Ellis (2009)
 
    Guided Harmonic Sinusoid Estimation in a Multi-Pitch Environment 
    Proc. WASPAA-09, Mohonk NY, October 2009, pp. 41-44.
 
  
    - 
    G. Grindlay and D. Ellis (2009)
 
    Multi-Voice Polyphonic Music Transcription Using Eigeninstruments 
    Proc. WASPAA-09, Mohonk NY, October 2009, pp. 53-56.
 
  
    - 
    J. Devaney, M. Mandel, and D. Ellis (2009)
 
    Improving Midi-Audio Alignment with Acoustic Features 
    Proc. WASPAA-09, Mohonk NY, October 2009, pp. 45-48.
 
  
    - 
    M. Mandel and D. Ellis (2009)
 
    The Ideal Interaural Parameter Mask: A Bound on Binaural Separation Systems 
    Proc. WASPAA-09, Mohonk NY, October 2009, pp. 85-88.
 
  
    - 
    W. Jiang, C. Cotton, S.-F. Chang, D. Ellis, and A. Loui (2009)
 
    Short-Term Audio-Visual Atoms for Generic Video Concept Classification 
    Proc. ACM MultiMedia-09, Beijing, October 2009, pp. 5-14.
 
  
    - 
    J. Gudnason, M. Thomas, P. Naylor, and D. Ellis (2009)
 
    Voice Source Waveform Analysis and Synthesis using Principal Component Analysis and Gaussian Mixture Modelling 
    Proc. Interspeech-09, Brighton, September 2009, pp. 108-111.
 
  
    - 
    J. Devaney and D. Ellis (2009)
 
    Handling Asynchrony in Audio-Score Alignment 
    Proc. ICMC-09, Montreal, pp. 29-32, August 2009.
 
  
    - 
    J. B. Boldt and D. Ellis (2009)
 
    A Simple Correlation-Based Model of Intelligibility for Nonlinear Speech Enhancement and Separation 
    Proc. EUSIPCO'09, Glasgow, August 2009, pp. 1849-1853.
 
  
    - 
    R. Weiss and D. Ellis (2009)
 
      A Variational EM Algorithm for Learning Eigenvoice Parameters in Mixed Signals 
      Proc. ICASSP-09, pp. 113-116, Taiwan, April 2009. 
 
  
    - 
    M. Mandel and D. Ellis (2008)
 
      Multiple-Instance Learning For Music Information Retrieval 
      Proc. ISMIR 2008, pp. 577-582, Philadelphia, September 2008.
 
  
   - 
   R. Weiss, M. Mandel, D. Ellis (2008)
 
     Source Separation Based on Binaural Cues and Source Model Constraints 
     Proc. Interspeech-08, pp. 419-422, Brisbane, Australia, September 2008.
 
  
   - 
   K. Hu, P. Divenyi, D. Ellis, Z. Jin, B. Shinn-Cunningham, D. Wang (2008)
 
     Preliminary Intelligibility Tests of a Monaural Speech Segregation System 
     Proc. SAPA-08, pp. 11-16, Brisbane, Australia, September 2008.
 
  
   - 
   A. Lammert, D. Ellis, P. Divenyi (2008)
 
     Data-driven articulatory inversion incorporating articulator priors 
     Proc. SAPA-08, pp. 29-34, Brisbane, Australia, September 2008.
 
  
    - 
    S. Ravuri and D. Ellis (2008)
 
    Stylization of Pitch with Syllable-Based Linear Segments 
    Proc. ICASSP-08 Las Vegas, April 2008, pp. 3985-3988.
 
  
    - 
    D. Ellis, C. Cotton, and M. Mandel (2008)
 
    Cross-Correlation of Beat-Synchronous Representations for Music Similarity 
    Proc. ICASSP-08 Las Vegas, April 2008, pp. 57-60. 
    See also the talk slides.
 
  
    - 
    J. H. Jensen, M. G. Christensen, D. Ellis, and S. H. Jensen (2008)
 
    A Tempo-Insensitive Distance Measure for Cover Song Identification based on Chroma Features 
    Proc. ICASSP-08 Las Vegas, April 2008,  pp. 2209-2212.
 
  
    - 
    K. Lee and D. Ellis (2008)
 
    Detecting Music in Ambient Audio by Long-Window Autocorrelation 
    Proc. ICASSP-08 Las Vegas, April 2008, pp. 9-12.
 
  
    - 
    M. Mandel and D. Ellis (2007)
 
      EM localization and separation using interaural level and phase cues 
    Proc. IEEE Workshop on Apps. of Sig. Proc. to Acous. and Audio WASPAA-07, Mohonk NY, October 2007, pp. 275-278.
 
  
    - 
    R. Weiss and D. Ellis (2007)
 
      Monaural speech separation using source-adapted models 
    Proc. IEEE Workshop on Apps. of Sig. Proc. to Acous. and Audio WASPAA-07, Mohonk NY, October 2007, pp. 114-117.
 
  
    - 
      C. Smit and D. Ellis (2007)
 
      Solo voice detection via optimal cancelation 
    Proc. IEEE Workshop on Apps. of Sig. Proc. to Acous. and Audio WASPAA-07, Mohonk NY, October 2007, pp. 207-210.
 
  
    - 
      G. Poliner and D. Ellis (2007)
 
      Improving generalization for polyphonic piano transcription 
    Proc. IEEE Workshop on Apps. of Sig. Proc. to Acous. and Audio WASPAA-07, Mohonk NY, October 2007, pp. 86-89.
 
  
    - 
    S.-F. Chang, D. Ellis, W. Jiang, K. Lee, A. Yanagawa, A. Loui, J. Luo (2007)
 
    Large-scale multimodal semantic concept detection for consumer video 
    Multimedia Information Retrieval workshop, ACM Multimedia Augsburg, Germany, Sep 2007, pp. 255-264. 
    DOI: 10.1145/1290082.1290118
 
  
    - 
    D. Ellis (2007)
 
    Classifying Music Audio with Timbral and Chroma Features 
    Proc. Int. Conf. on Music Info. Retrieval ISMIR-07 Vienna, Austria, pp. 339-340. 
(See also the poster I presented at ISMIR-07.)
 
  
    - 
    M. Mandel and D. Ellis (2007)
 
    A Web-Based Game for Collecting Music Metadata 
    Proc. Int. Conf. on Music Info. Retrieval ISMIR-07 Vienna, Austria, pp. 365-366. 
    (See also the 6 page tech. report.)
 
  
    - 
    J. H. Jensen, D. Ellis, M. G. Christensen, S. H. Jensen (2007)
 
    Evaluation of Distance Measures Between Gaussian Mixture Models of MFCCs 
    Proc. Int. Conf. on Music Info. Retrieval ISMIR-07 Vienna, Austria, pp. 107-108.
 
  
    - 
    D. Ellis and C. Cotton (2007)
 
    The 2007 LabROSA Cover Song Detection System 
    MIREX 2007 Audio Cover Song Evaluation system description, Sep 2007. (4pp) 
(See also the poster I presented at ISMIR-07.)
 
  
    - 
    A. Doherty, A. Smeaton, K.-S. Lee, and D. Ellis (2007)
 
    Multimodal Segmentation of Lifelog Data 
    Proc. 8th Int. Conf. on Computer-Assisted Information Retrieval RIAO 2007, Pittsburgh, May 2007. (18pp)
 
  
    - 
    J. Ogle and D. Ellis (2007)
 
    Fingerprinting to Identify Repeated Sound Events in Long-Duration Personal Audio Recordings 
    Proc. ICASSP-07 Hawai'i, pp. I-233-236. (4pp)
 
  
    - 
    D. Ellis and G. Poliner (2007)
 
    Identifying Cover Songs With Chroma Features and Dynamic Programming Beat Tracking 
    Proc. ICASSP-07 Hawai'i, pp. IV-1429-1432. (4pp)
 
  
    - 
    M. Mandel, D. Ellis, and T. Jebara (2006)
 
      An EM algorithm for localizing multiple sound sources in reverberant environments 
    Advances Neural Info. Proc. Sys. 19, Vancouver CA, Dec 2006, pp. 953-960. (8pp)
 
  
    - 
    K. Lee and D. Ellis (2006)
 
      Voice Activity Detection in Personal Audio Recordings Using Autocorrelogram Compensation  
    Interspeech ICSLP-06, pp. 1970-1973, Pittsburgh, Oct 2006. (4pp)
 
  
    - 
    M. Mandel and D. Ellis (2006)
   
      A probability model for interaural phase difference 
    Proc. Workshop on Statistical and Perceptual Audition SAPA-06, pp. 1-6, Pittsburgh PA, Oct 2006. (6pp)
 
  
    - 
    R. Weiss and D. Ellis (2006)
   
      Estimating single-channel source separation masks: Relevance Vector Machine classifiers vs. pitch-based masking 
    Proc. Workshop on Statistical and Perceptual Audition SAPA-06, pp. 31-36, Pittsburgh PA, Oct 2006. (6pp)
 
  
    - 
    D. Ellis (2006)
 
      Identifying 'Cover Songs' with Beat-Synchronous Chroma Features 
      MIREX 2006 Audio Cover Song Contest system description, Sep 2006. (4pp)
 
  
    - 
    D. Ellis (2006)
 
      Beat Tracking with Dynamic Programming 
      MIREX 2006 Audio Beat Tracking Contest system description, Sep 2006. (3pp)
 
  
    - 
    D. Ellis and R. Weiss (2006)
 
      Model-Based Monaural Source Separation Using a Vector-Quantized Phase-Vocoder Representation 
    Proc. ICASSP-06, Toulouse, May 2006, pp. V-957-960. (4pp)
 
  
    - 
    X. Halkias and D. Ellis (2006)
 
      Estimating the Number of Marine Mammals using Recordings of Clicks from One Microphone 
    Proc. ICASSP-06, Toulouse, May 2006, pp. V-769-772. (4pp).
 
  
    - 
    M. Reyes-Gomez, N. Jojic, and D. Ellis (2005)
 
    Deformable Spectrograms 
    AI & Statistics 2005, Barbados, Jan 2005 pp. 285-292. (8pp)
 
  
    - 
    G. Poliner, D. Ellis (2005)
 
    A Classification Approach to Melody Transcription 
    Proc. Int. Conf. on Music Info. Retrieval ISMIR-05, London, September 2005, pp. 161-166. (6pp)
 
  
    - 
    M. Mandel, D. Ellis (2005)
 
    Song-Level Features and Support Vector Machines for Music Classification 
    Proc. Int. Conf. on Music Info. Retrieval ISMIR-05, London, September 2005, pp. 594-599. (6pp)
 
  
    - 
	K. Dobson, B. Whitman, D. Ellis (2005)
 
    Learning Auditory Models of Machine Voices 
    Proc. IEEE Workshop on Apps. of Sig. Proc. to Acous. and Audio WASPAA-05, Mohonk NY, October 2005, pp. 339-342. (4pp)
 
  
   - 
   C.-P. Chen, J. Bilmes, D. Ellis (2005)
 
   Speech Feature Smoothing for Robust ASR 
   Proc. ICASSP-05, Philadelphia, March 2005, pp. I-525-528. (4pp)
 
  
    - 
    N. Lesser, D. Ellis (2005)
 
    Clap Detection and Discrimination for Rhythm Therapy 
    Proc. ICASSP-05, Philadelphia, March 2005, pp. III-37-40. (4pp) 
    (See also the talk slides which describe an energy ratio feature that does much better than the ones described in the paper.)
 
  
    - 
    M. Athineos, H. Hermansky and D. Ellis (2004)
 
    LP-TRAP: Linear predictive temporal patterns 
    International Conference on Spoken Language Processing ICSLP-04, Jeju, Korea, Oct 2004, pp. 949-952. (4pp)
 
  
    - 
    M. Athineos, H. Hermansky and D. Ellis (2004)
 
    PLP^2: Autoregressive modeling of auditory-like 2-D spectro-temporal patterns 
    ISCA Tutorial and Research Workshop on Statistical and Perceptual Audio Processing SAPA-04, Jeju, Korea, Oct 2004, pp. 37-42. (5pp)
 
  
    - 
    M. Reyes-Gomez, N. Jojic, and D. Ellis (2004)
 
    
Towards single-channel unsupervised source separation of speech mixtures: The layered harmonics/formants separation-tracking model 
    ISCA Tutorial and Research Workshop on Statistical and Perceptual Audio Processing SAPA-04, Jeju, Korea, Oct 2004, pp. 25-30. (6pp)
 
  
    - 
    D. Ellis and J. Liu (2004)
 
    Speaker turn segmentation based on between-channel differences 
    NIST Meeting Recognition Workshop @ ICASSP, pp. 112-117, Montreal, May 2004. (6pp)
 
  
    - 
    L. Kennedy and D. Ellis (2004)
 
    Laughter Detection in Meetings 
    NIST Meeting Recognition Workshop @ ICASSP, pp. 118-121, Montreal, May 2004. (4pp)
 
  
    - 
    M.J. Reyes-Gomez, D. Ellis, N. Jojic (2004)
 
    Multiband Audio Modeling for Single Channel Acoustic Source Separation 
    Proc. ICASSP-04, pp. V-641-644, Montreal, May 2004. (4pp)
 
  
    - 
    M.J. Reyes-Gomez, N. Jojic, D. Ellis (2004)
 
    Detailed graphical models for source separation and missing data interpolation in audio 
    Snowbird Learning Workshop, Snowbird, 2004. (2pp)
 
  
    - 
    D. Ellis and J. Arroyo (2004)
 
    Eigenrhythms: Drum pattern basis sets for classification and generation  
    International Symposium on Music Information Retrieval ISMIR-04, Barcelona, Oct 2004, pp. 554-559. (6pp)  
  (longer tech report version with color figures)
 
  
    - 
    B. Whitman and D. Ellis (2004)
 
    Automatic Record Reviews 
    International Symposium on Music Information Retrieval ISMIR-04, Barcelona, Oct 2004, pp. 470-477. (8pp)
 
  
    - 
    D. Ellis and K.S. Lee (2004)
 
    Minimal-Impact Audio-Based Personal Archives 
    First ACM workshop on Continuous Archiving and Recording of Personal Experiences CARPE-04, New York, Oct 2004, pp. 39-47. (9pp)
 
  
    - 
    D. Ellis and K.S. Lee (2004)
 
    Features for Segmenting and Classifying Long-Duration Recordings of Personal Audio 
    ISCA Tutorial and Research Workshop on Statistical and Perceptual Audio Processing SAPA-04, Jeju, Korea, Oct 2004, pp. 1-6. (6pp)
 
  
    - 
    L. Kennedy and D. Ellis (2003)
 
    Pitch-based emphasis detection for characterization of meeting recordings 
    Automatic Speech Recognition and Understanding Workshop IEEE ASRU 2003, pp. 243-248, St. Thomas, December 2003. (6pp)
 
  
    - 
    M. Athineos and D. Ellis (2003)
 
    Frequency-domain linear prediction for temporal features 
    Automatic Speech Recognition and Understanding Workshop IEEE ASRU 2003, pp. 261-266, St. Thomas, December 2003. (6pp)
 
  
    - 
    M.J. Reyes-Gomez, B. Raj, D. Ellis (2003)
 
    Multi-channel Source Separation by Beamforming Trained with Factorial HMMs 
    Proc. IEEE Workshop on Apps. of Sig. Proc. to Acous. and Audio, pp. 13-16, Mohonk NY, October 2003. (4pp)
 
  
    - 
    P. Scanlon, D. Ellis, R. Reilly (2003)
 
    Using Mutual Information to design class-specific phone recognizers 
    Proc. Eurospeech-03, Geneva, September 2003, pp. 857-860. (4pp)
 
  
    - 
    S. Renals and D. Ellis (2003)
 
    Audio Information Access from Meeting Rooms 
    Proc. ICASSP-03, Hong Kong, April 2003, pp. IV-744-747. (4pp)
 
  
    - 
    M.J. Reyes-Gomez, B. Raj, D. Ellis (2003)
 
    Multi-channel Source Separation by Factorial HMMs 
    Proc. ICASSP-03, Hong Kong, April 2003, pp. I-664-667. (4pp)
 
  
    - 
    A. Janin, D. Baron, J. Edwards, D. Ellis, D. Gelbart, N. Morgan, B. Peskin, T. Pfau, E. Shriberg, A. Stolcke, C. Wooters (2003)
 
    The ICSI Meeting Corpus 
    Proc. ICASSP-03, Hong Kong, April 2003, pp. I-364-367. (4pp)
 
  
    - 
    M. Athineos and D. Ellis (2003)
 
    Sound Texture Modelling with Linear Prediction in both Time and Frequency Domains 
    Proc. ICASSP-03, Hong Kong, April 2003, pp. V-648-651. (4pp)
 
  
    - 
    A. Sheh and D. Ellis (2003)
 
    Chord Segmentation and Recognition using EM-Trained Hidden Markov Models 
    4th International Symposium on Music Information Retrieval ISMIR-03, pp. 185-191, Baltimore, October 2003. (7pp)
 
  
    - 
    R. Turetsky and D. Ellis (2003)
 
    Ground-Truth Transcriptions of Real Music from Force-Aligned MIDI Syntheses 
    4th International Symposium on Music Information Retrieval ISMIR-03, pp. 135-141, Baltimore, October 2003. (7pp)
 
  
    - 
	A. Berenzweig, B. Logan, D. Ellis, B. Whitman (2003)
 
    A large-scale evaluation of acoustic and subjective music similarity measures 
    4th International Symposium on Music Information Retrieval ISMIR-03, pp. 103-109, Baltimore, October 2003. (7pp)
 
  
    - 
    B. Logan, D. Ellis, A. Berenzweig (2003)
 
    Toward evaluation techniques for music similarity 
    Keynote address, Workshop on the Evaluation of Music Information Retrieval (MIR) Systems at SIGIR 2003, Toronto, August 2003. (5pp)
 
  
    - 
    A. Berenzweig, D. Ellis & S. Lawrence (2003)
 
    Anchor Space for Classification and Similarity Measurement of Music 
    Proc. ICME-03, Baltimore, July 2003, pp. I-29-32. (4pp)
 
  
    - 
    M.J. Reyes-Gomez and D. Ellis (2003)
 
    Selection, Parameter Estimation, and Discriminative Training of Hidden Markov Models for General Audio Modeling 
    Proc. ICME-03, Baltimore, July 2003, pp. I-73-76. (4pp)
 
  
    - 
    M.J. Reyes-Gomez and D. Ellis (2002)
 
    Error visualization for tandem acoustic modeling on the Aurora task
     
    ICASSP-02 (student session), Orlando, May 2002. (4pp)
 
  
    - 
    D. Ellis, B. Whitman, A. Berenzweig, S. Lawrence (2002)
 
    The Quest for Ground Truth in Musical Artist Similarity 
    Proc. ISMIR-02, pp. 170-177, Paris, October 2002. (8pp)
 
  
    - 
    A. Berenzweig, D. Ellis, S. Lawrence (2002)
 
    
      Using Voice Segments to Improve Artist Classification of Music 
    Proc. AES-22 Intl. Conf. on Virt., Synth., and Ent. Audio, Espoo, Finland, June 2002. (8pp)
 
  
    - 
    T. Pfau, D. Ellis, A. Stolcke (2001)
 
    
    Multispeaker Speech Activity Detection for the ICSI Meeting Recorder
     
    Proc. ASRU-01, Italy, December 2001. (4pp)
 
  
    - 
    J. Barker, M. Cooke, D. Ellis (2001)
 
    
    Integrating bottom-up and top-down constraints to achieve robust ASR: The multisource decoder 
    Presented at the CRAC workshop, pp. 63-66, Aalborg, Denmark, September 2001. (4pp)
 
  
    - 
    D. Ellis and M.J. Reyes-Gomez (2001)
 
    
    Investigations into Tandem Acoustic Modeling for the Aurora Task
     
    Proc. Eurospeech-01, Special Event on Noise Robust Recognition, pp. 189-192, 
    Denmark, September 2001. (4pp) 
    (See also the poster I presented at the conference.)
 
  
    - 
    D. Ellis, R. Singh, S. Sivadas (2001)
 
    
    Tandem acoustic modeling in large-vocabulary recognition
     
    Proc. ICASSP-2001, pp. I-517-520, Salt Lake City, May 2001. (4pp) 
      (See also the poster I presented at the conference.)
 
  
    - 
    N. Morgan, D. Baron, J. Edwards, D. Ellis, D. Gelbart, A. Janin, T. Pfau, E. Shriberg, A. Stolcke (2001)
 
    
    The Meeting Project at ICSI 
    Human Language Technologies Conference, San Diego, March 2001, pp. 246-252. (7pp)
 
  
    - 
    A.L. Berenzweig and D. Ellis (2001)
 
    
      Locating Singing Voice Segments within Music Signals 
    Proc. IEEE Workshop on Apps. of Sig. Proc. to Acous. 
    and Audio, pp. 119-122, Mohonk NY, October 2001. (4pp)
 
  
    - 
    D. Ellis (2001)
 
    
    Detecting Alarm Sounds 
    Presented at the CRAC workshop, 
     pp. 59-62, Aalborg, Denmark, September 2001. (4pp) 
     (See also the poster I presented at the workshop.)
 
  
    - 
    D. Ellis and J.A. Bilmes (2000)
 
    
    Using mutual information to design feature combinations 
    Proc. ICSLP-2000, Beijing, October 2000. (4pp)
 
  
    - 
    J. Barker, M. Cooke and D. Ellis (2000)
 
    
    Decoding speech in the presence of other sound sources 
    Proc. ICSLP-2000, Beijing, October 2000. (4pp)
 
  
    - 
    J. Ferreiros-Lopez and D. Ellis (2000)
 
    
    Using acoustic condition clustering to improve acoustic change detection on Broadcast News 
    Proc. ICSLP-2000, Beijing, October 2000. (4pp)
 
  
    - 
    D. Ellis (2000)
 
    
    Improved recognition by combining different features and different systems 
    Proc. AVIOS-2000, San Jose, May 2000. (7pp)
 
  
    - 
    H. Hermansky, D. Ellis and S. Sharma (2000)
 
    
    Tandem connectionist feature stream extraction for conventional HMM systems 
    Proc. ICASSP-2000, Istanbul, III-1635-1638. (4pp) 
        (See also the poster I presented at the conference.)
 
  
    - 
    S. Sharma, D. Ellis, S. Kajarekar, P. Jain and H. Hermansky (2000)
 
    
    Feature extraction using non-linear transformation for robust speech recognition on the Aurora database 
    Proc. ICASSP-2000, Istanbul, II-1117-1120. (4pp)
 
  
    - 
    D. Genoud, D. Ellis and N. Morgan (1999)
 
    
    Combined speech and speaker recognition with speaker-adapted connectionist models 
    Proc. Auto. Speech Recog. & Understanding Workshop, Keystone. (4pp)
 
  
    - 
    D. Abberley, S. Renals, T. Robinson and D. Ellis (1999)
 
    
    The THISL SDR system at TREC-8 
   Proc. Text Retrieval Conference 8, Washington. (6pp)
 
  
   - 
   G. Williams and D. Ellis (1999)
 
   
   Speech/music discrimination based on posterior probability features 
   Proc. Eurospeech-99, Budapest. (4pp)
 
  
   - 
   A. Janin, D. Ellis and N. Morgan (1999)
 
   
   Multi-stream speech recognition: Ready for prime time? 
   Proc. Eurospeech-99, Budapest. (4pp)
 
  
   - 
   D. Ellis and N. Morgan (1999)
 
   
   Size matters: An empirical study of neural network training for large vocabulary continuous speech recognition 
   Proc. ICASSP-99, Phoenix. (4pp)
 
  
   - 
   N. Morgan, D. Ellis, E. Fosler-Lussier, A. Janin and B. Kingsbury (1999)
 
   
   Reducing errors by increasing the error rate: MLP Acoustic Modeling for Broadcast News Transcription 
   Presented at the DARPA Broadcast News Transcription and Understanding Workshop, Gaithersburg VA, February 28, 1999. (4pp)
 
  
   - 
   G. Cook, J. Christie, D. Ellis, E. Fosler-Lussier, Y. Gotoh, B. Kingsbury, N. Morgan, S. Renals, T. Robinson and G. Williams (1999)
 
   
   The SPRACH System for the Transcription of Broadcast News 
   Presented at the DARPA Broadcast News Transcription and Understanding Workshop, Gaithersburg VA, February 28, 1999. (4pp)
 
  
  - 
  D. Ellis (1997)
 
  
  The Weft: A representation for periodic sounds 
  Proc. Int. Conf. on Acous., Speech & Sig. Proc. ICASSP-97, Munich, vol. 2 pp. 1307-1310, April 1997. (4pp) 
      (See also the poster I presented at the conference.)
 
  
   - 
   D. Ellis (1997)
 
   
   Computational Auditory Scene Analysis exploiting Speech-Recognition knowledge 
   Proc. IEEE Workshop on Apps. of Sig. Proc. to Acous. and Audio, Mohonk, October 1997. (4pp)
 
  
   - 
   D. Ellis (1996)
 
   
   Prediction-driven computational auditory scene analysis for dense sound mixtures 
   Proc. ESCA Workshop on the Auditory Basis of Speech Perception, Keele, July 1996. (6pp)
 
  
   - 
   D. Ellis (1995)
 
   
   Underconstrained stochastic representations for top-down computational auditory scene analysis 
   Proc. IEEE Workshop on Apps. of Sig. Proc. to Acous. and Audio, Mohonk, October 1995. (4pp)
 
  
   - 
   D. Ellis (1994)
 
   
   A computer implementation of psychoacoustic grouping rules 
   Proc. 12th Intl. Conf. on Pattern Recognition, Jerusalem, October 1994. (9pp)
 
  
   - 
   D. Ellis (1993)
 
   
   Hierarchic models of sound for separation and restoration 
   Proc. 1993 IEEE Mohonk workshop on Applications of Signal Processing to Acoustics and Audio, October 1993. (4pp)
 
  
 
 |