NEMISIG (North East Music Information Special Interest Group) is a yearly informal workshop that brings together researchers from institutions across the Northeastern United States who work on music information retrieval. The meeting provides a space for open discussion of new ideas and aims to foster intercollegiate collaboration. Previous NEMISIG meetings have been held at Columbia University (2008), New York University (2010), Drexel University (2011), Dartmouth College (2012), and The Echo Nest (2013).

NEMISIG 2014 was held on January 25th at Columbia University in the newly opened Studio@Butler. The event was immediately followed by Hacking Audio and Music Research 2014 so that attendees had the opportunity to develop new ideas and collaborations. NEMISIG 2014 was organized by LabROSA and was supported by the Columbia University Electrical Engineering Department and the School of Engineering and Applied Science.


9:30 AM - Coffee and bagels
10:00 AM - Opening remarks
10:10 AM - Lab overview talks
10:10 AM - Doug Turnbull - JimiLab @ Ithaca [Slides]
A summary of music-IR related research at Ithaca College and Cornell University.
10:25 AM - Eric Humphrey - MARL @ New York University [Slides]
The Music and Audio Research Laboratory (MARL) brings together researchers from various disciplines including music psychology, computer and information science, theory and composition, and interactive performance systems to advance the state of the art in music technologies. This talk will detail ongoing research projects in the realm of music informatics and present recent progress in these areas.
10:40 AM - Youngmoo Kim - Drexel ExCITe Center [Slides]
The Expressive and Creative Interaction Technologies (ExCITe) Center is a strategic initiative of Drexel University bringing together faculty, students, and entrepreneurs from engineering, computer and information science, digital media, fashion design, performing arts, product design, and many other fields to pursue highly multi-disciplinary collaborative projects. ExCITe encompasses several research groups at Drexel, including the Music & Entertainment Technology Laboratory (MET-lab). The research and education activities of the ExCITe Center emphasize the arts-integrated approach of STEAM [vs. traditional STEM]. Additionally, ExCITe serves to connect knowledge and resources across Philadelphia through civic, arts and culture, and industry partnerships with other institutions and organizations in the region.
10:55 AM - Dan Ellis - LabROSA @ Columbia University [Slides]
The Laboratory for the Recognition and Organization of Speech and Audio (LabROSA) conducts research into automatic means of extracting useful information from sound. Our vision is of an intelligent 'machine listener', able to interpret live or recorded sound of any type in terms of the descriptions and abstractions that would make sense to a human listener. This talk will provide a brief overview of the current research being conducted at LabROSA.
11:10 AM - Douglas Eck - Music Research at Google [Slides]
I will provide a casual, high-level overview of music recommendation and discovery research at Google. I'll focus on challenges in combining audio and collaborative filtering models in production. Finally, I'll discuss the use of the Knowledge Graph as a means of providing structured data about the world of music.
11:30 AM - Coffee break
11:45 AM - Project talks
11:45 AM - Dawen Liang - Bayesian hierarchical modeling for music and audio processing at LabROSA [Slides]
In this talk, I will briefly describe two Bayesian hierarchical models I've been working on for music and audio processing: beta process sparse NMF, a Bayesian nonparametric extension of traditional NMF that infers the number of components as part of the inference, and Product-of-Filters (PoF), a novel model that decomposes monophonic sounds into simpler systems.
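For context, the beta process model above extends classical NMF, which factors a nonnegative spectrogram V into spectral templates W and activations H with a fixed number of components k. Below is a minimal sketch of that base model using the standard Lee-Seung multiplicative updates, not the Bayesian nonparametric extension described in the talk; all names and dimensions are illustrative.

```python
import numpy as np

def nmf(V, k, n_iter=200, eps=1e-9, seed=0):
    """Classical NMF with Euclidean-loss multiplicative updates.

    The beta process sparse NMF in the talk infers k as part of
    Bayesian inference; here k must be fixed in advance.
    """
    rng = np.random.default_rng(seed)
    F, T = V.shape
    W = rng.random((F, k)) + eps   # spectral templates (F x k)
    H = rng.random((k, T)) + eps   # activations (k x T)
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy nonnegative "spectrogram" for demonstration.
V = np.abs(np.random.default_rng(1).normal(size=(16, 40)))
W, H = nmf(V, k=2)
```

The multiplicative form guarantees W and H stay nonnegative if initialized nonnegative, which is the property the Bayesian extension preserves while adding priors over the number of components.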
12:05 PM - Oriol Nieto - Music Segment Similarity Using 2D-Fourier Magnitude Coefficients [Slides]
Music segmentation is the task of automatically identifying the different segments of a piece. In this work we present a novel approach to cluster the musical segments based on their acoustic similarity by using 2D-Fourier Magnitude Coefficients (2D-FMCs). These coefficients, computed from a harmonic representation, significantly simplify the problem of clustering the different segments since they are key transposition and phase shift invariant. We explore various strategies to obtain the 2D-FMC patches that represent entire segments and apply k-means to label them. Finally, we discuss possible ways of estimating k and compare our competitive results with the current state of the art.
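The invariance property this abstract relies on can be sketched in a few lines: the magnitude of the 2D DFT of a chroma patch is unchanged by circular shifts along the pitch axis (key transposition) or the time axis (phase shift), since such shifts only alter the phase. A minimal illustration with variable names of my own choosing:

```python
import numpy as np

def fmc2d(chroma_patch):
    """2D-Fourier Magnitude Coefficients of a (12 x N) chroma patch.

    By the circular shift theorem, shifting along either axis changes
    only the phase of the 2D DFT, so the magnitude is invariant.
    """
    return np.abs(np.fft.fft2(chroma_patch))

rng = np.random.default_rng(0)
patch = rng.random((12, 32))
transposed = np.roll(patch, 3, axis=0)   # transpose up 3 semitones
shifted = np.roll(patch, 5, axis=1)      # circular shift in time
```

In the approach described above, patches like these would be flattened and clustered (e.g. with k-means) to label segments by acoustic similarity.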
12:25 PM - Brian McFee - What is librosa and why should you use it? [Slides]
librosa is a Python package for music and audio processing. It includes the most common high- and low-level routines needed to extract information from music and audio. This talk will consist of a live demo of its functionality and a discussion of some of its future goals.
12:45 PM - Lunch and poster session
David Rosen - The Impact of Explicit Instructions, Personality, Affect, and Domain-Expertise on Creative Improvisation amongst Jazz Pianists
The present study’s primary objective is to determine whether jazz improvisation performance amongst expert jazz pianists significantly differs in creativity when explicit instructions to be creative are given. Furthermore, this work seeks to better understand and explore potential factors that may contribute to individuals’ enhanced or stifled creative performance under this conditional manipulation. This study will collect demographic and domain-expertise information, two personality inventories (Big Five Inventory and Frost Multidimensional Perfectionism Scale), and pre- and post-affect surveys (Positive and Negative Affect Scale) from all participants to evaluate the relationship between these factors and creative performance on a jazz improvisation task when explicitly instructed to be creative. In addition, musicians will take part in a brief interview after the experiment to discuss the cognitive strategies they implemented and to understand participants’ notions of key features of creative jazz improvisation.
David Grunberg - Music-IR In Noisy Acoustic Environments
I am developing Music-IR systems that can perform accurately even in the presence of acoustic noise. While much work has been done in the field of Music-IR over the past years, most of the research community’s focus has been on clean audio signals taken directly from professionally recorded CDs. We humans, however, often listen to music over physical acoustic channels, with the result that the music we hear is contaminated by other sounds from the environment, or noise. It would be useful to users as well as the research community if Music-IR systems could function on this type of audio as well, but noise tends to degrade the performance of even state-of-the-art Music-IR algorithms. I am therefore researching the development of algorithms that are more robust to acoustic noise.
Matthew Prockup - Content Based Analysis of Expressive Percussion
Musical expression is the creative nuance through which a musician conveys emotion and connects with a listener. In this work, we present a system that seeks to classify different expressive articulation techniques independent of percussion instrument. One use of this system is to enhance the organization of large percussion sample libraries, which can be cumbersome and daunting to navigate. This work is also a necessary first step towards understanding musical expression as it relates to percussion performance. The ability to classify expressive techniques can lead to the development of models that learn the functionality of articulations in patterns, as well as how certain performers use them to communicate and define their musical style.
Brandon Morton - Predicting Musical Influence Using Non-Negative Matrix Factorization
All musicians have been influenced by previous generations of artists. By examining the flow of influence, we gain an interesting look into how music has evolved and been adapted throughout history. For this project, we are attempting to find a way to predict musical influence relationships between artists, using only content-based methods and non-negative tensor factorization. By using NTF, we hope to retain some of the structure that is lost when using other factorization methods. We have collected data using the All Music Guide and the 7digital API and hope that this information will provide us a starting point for a deeper understanding of influence between musicians.
Jeff Gregorio - Analysis of Self-Consistency in Piano Performance
The vast majority of keyboard musical instruments today communicate key events using the Musical Instrument Digital Interface (MIDI) standard, typically allowing only information about event timing and key velocity, which is insufficient to capture the nuances of piano touch. Use of a continuous key position tracking system affords a myriad of possibilities for expressive control, yet the question arises as to what degree keyboardists of different levels of training can consistently reproduce the finer details of expression. In order to inform the design of musical interfaces utilizing continuous key position, we test the hypothesis that intended expression is indeed readily reproducible in terms of commonly used features. Toward this end, participants are asked to perform excerpts as consistently as possible, where features of key movement and timing are computed and analyzed. Highly correlated features are identified as of primary utility in creating novel mappings to synthesizer parameters.
Alyssa Batula - Musical Humanoid Robots
We aim to develop a general-purpose humanoid robot capable of participating in live human-robot musical ensembles. In order to fully participate in an ensemble, robotic musicians must be able to listen to the audio and identify high-level features of the music (e.g. score location, tempo, key) in order to adjust their performance in real time to match their fellow musicians. Our efforts so far have focused on Hubo, an adult-sized humanoid robot. In order to allow the robot to adjust its performance, we have developed several algorithms that analyze acoustic music and extract features such as beat locations, tempo, pitch, and mood. An algorithm using both audio and haptic feedback detects misplayed notes on a pitched pipe instrument. Real-time beat and mood detection are used to change the robot's dance motions to fit the mood and tempo of the music. A fully dexterous, low-cost, 3D-printed robotic hand is capable of executing the range of motion required to play the piano. Finally, a power model is being developed in order to predict the robot's power consumption as it moves, which could improve its overall efficiency.
Michael Caro - Emotion recognition in music using neural networks
Deep recurrent neural networks differ intrinsically from deep feedforward networks in that they operate not only on an input space but also on an internal state that accounts for information the network has already processed. In previous work, researchers have focused on pre-training techniques involving unsupervised processing of large datasets. That approach has shown reasonable improvements over earlier work because deep neural networks can learn hierarchies of features. To that end, a deep network can be thought of as a processing pipeline: each part of the network processes a portion of the data and passes it on to the next. But the strength of a basic recurrent neural network is that it introduces memory, not feature hierarchies. It is therefore natural that stacking neural networks with temporal feedback loops will provide the model with the ability to learn temporal hierarchies. In this new work, we seek to develop a model that can learn about the temporal evolution of music and its relation to emotion using the `1000 song dataset for the emotional analysis of music.'
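The memory argument above rests on the core recurrence of a vanilla recurrent layer, in which the hidden state carries a summary of everything processed so far; stacking such layers is what the abstract argues yields temporal hierarchies. A minimal sketch, with dimensions and initialization chosen purely for illustration:

```python
import numpy as np

def rnn_step(x_t, h_prev, Wx, Wh, b):
    """One step of a vanilla recurrent layer: the new hidden state
    depends on the current input AND the previous internal state."""
    return np.tanh(Wx @ x_t + Wh @ h_prev + b)

rng = np.random.default_rng(0)
n_in, n_hid = 8, 16
Wx = rng.normal(scale=0.1, size=(n_hid, n_in))
Wh = rng.normal(scale=0.1, size=(n_hid, n_hid))
b = np.zeros(n_hid)

# Run a short input sequence through the recurrence; h accumulates
# a memory of the whole sequence, not just the last frame.
h = np.zeros(n_hid)
for x_t in rng.normal(size=(20, n_in)):
    h = rnn_step(x_t, h, Wx, Wh, b)
```

Stacking these layers (feeding one layer's hidden states to the next) is the "deep recurrent" architecture the abstract proposes for modeling emotion over time.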
Jeffrey Scott - Unsupervised Clustering of Multi-Track Instruments
Achieving balance and clarity in a mixture comprising a multitude of instrument layers requires experience in evaluating and modifying the individual elements and their sum. Creating a mix involves many technical concerns (level balancing, dynamic range control, stereo panning, spectral balance) as well as artistic decisions (modulation effects, distortion effects, side-chaining, etc.). This work is a first step toward developing a continuous space that organizes instrument tracks in terms of their spectro-temporal characteristics specifically for mixing. The goal of the space is to be able to directly map music production mixing parameters to coordinates associated with specific sonic characteristics. This approach explores the efficacy of multiple time scale feature integration and k-means to organize the instruments in a perceptually meaningful manner.
Adeesha Ekanayake - MeUse: Recommending Internet Radio Stations
We describe a novel Internet radio recommendation system called MeUse. We use the Shoutcast API to collect historical data about the artists that are played on a large set of Internet radio stations. This data is used to populate an artist-station index that is similar to the term-document matrix of a traditional text-based information retrieval system. A small-scale user study suggests that the majority of users enjoyed using MeUse but that providing additional contextual information may be needed to help with recommendation transparency.
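The term-document analogy above can be made concrete with a toy sketch: artists play the role of terms, stations the role of documents, and stations are ranked by cosine similarity to a listener's artist profile. All station names and play counts below are hypothetical, not data from MeUse.

```python
import numpy as np

# Hypothetical artist-station index: rows are artists, columns are
# stations, entries are historical play counts.
stations = ["indie_fm", "jazz24", "synthwave"]
counts = np.array([
    [30,  0,  5],   # artist A
    [25,  2,  0],   # artist B
    [ 0, 40,  1],   # artist C
    [ 1,  0, 50],   # artist D
], dtype=float)

def recommend(user_profile, counts, stations):
    """Rank stations by cosine similarity between the listener's
    artist preference vector and each station's artist-count column."""
    cols = counts / (np.linalg.norm(counts, axis=0, keepdims=True) + 1e-12)
    user = user_profile / (np.linalg.norm(user_profile) + 1e-12)
    scores = user @ cols
    return [stations[i] for i in np.argsort(scores)[::-1]]

# A listener who likes artists A and B is steered toward indie_fm.
ranking = recommend(np.array([1.0, 1.0, 0.0, 0.0]), counts, stations)
```

Exposing the overlapping artists behind each score is one way the contextual information mentioned in the abstract could improve recommendation transparency.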
Dawen Liang - A Generative Product-of-Filters Model of Audio
We describe the product-of-filters (PoF) model, a generative model that decomposes audio spectrograms as a linear combination of "filters" in the log-spectral domain. The theoretical justification behind the PoF model is the homomorphic filtering approach proposed for audio signal processing. We formulate the model and present a mean-field method for posterior inference and a variational EM algorithm to estimate the parameters under the maximum marginal likelihood setting. Two tasks are used to evaluate the effectiveness of the PoF model: bandwidth expansion and speaker identification, and we see improvement on both tasks compared to other widely used approaches.
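The decomposition stated in the abstract, a linear combination of filters in the log-spectral domain, is equivalently a product of exponentiated filters in the linear domain, which is where the model's name comes from. A sketch in illustrative notation (the symbols u and a are mine, not necessarily the paper's):

```latex
% Spectrogram entry x_{ft} (frequency f, frame t) modeled with L
% learned filters u_l and per-frame activations a_{lt}:
\log x_{ft} \approx \sum_{l=1}^{L} u_{fl}\, a_{lt}
\quad\Longleftrightarrow\quad
x_{ft} \approx \prod_{l=1}^{L} \exp\!\left(u_{fl}\, a_{lt}\right)
```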
2:00 PM - Breakout discussion planning
2:10 PM - Breakout session 1
3:00 PM - Breakout session 2
3:50 PM - Recap and wrap-up
4:30 PM - HAMR 2014 begins


Click here for a photo of the attendees.

Name Affiliation
Kenshin Maybe Boston University
Brian House Brown University
Brian McFee Columbia University
Emily Chen Columbia University
Helene Papadopoulos Columbia University
Nick Patterson Columbia University
Tad Shull Columbia University
Zhuo Chen Columbia University
Colin Raffel Columbia University
Dan Ellis Columbia University
Dawen Liang Columbia University
Justin Zupnick Cornell University
Andy Sarroff Dartmouth College
Alyssa Batula Drexel University
Brandon Morton Drexel University
David Grunberg Drexel University
David Rosen Drexel University
Evan Dissanayake Drexel University
Jeff Gregorio Drexel University
Jeffrey Scott Drexel University
Matthew Prockup Drexel University
Michael N. Caro Drexel University
William Hilton Drexel University
Youngmoo Kim Drexel University
Ali Razavi Google
Daniel Steinberg Google
Douglas Eck Google
Kevin Wilson Google
Maneesh Bhand Google
Philippe Hamel Google
Ron Weiss Google
Sourish Chaudhuri Google
Tony Lam Google
Steven Rennie IBM
Lise Regnier Ircam
Adeesha Ekanayake Ithaca College
Douglas Turnbull Ithaca College
Kristofer Stensland Ithaca College
Laurence Welch Ithaca College
Mahtab Ghamsari McGill University
John Hershey MERL
Jonathan Le Roux MERL
Jeremy Sawruk None
Brian Kolterman None
Daniel Cashin None
Thomas Wilson None
Amar Lal NYU
Eric Humphrey NYU
Jon Forsyth NYU
Justin Salamon NYU
Michael Musick NYU
Oriol Nieto NYU
Rachel Bittner NYU
Tlacael Esparza NYU
Zeyu Jin Princeton University
Alex Cannon Swarthmore College
Andreas Jansson The Echo Nest
Hunter McCurry The Echo Nest
Nicola Montecchio The Echo Nest

Mailing list/contact

For updates, questions, and suggestions, please join the NEMISIG Google group.