OTANALYZE - Utility to decompose noisy audio files
otanalyze is a (compiled) Matlab script that, given a clean signal and a noisy, filtered version of it, can be used to separate the noise and characterize the filtering that was applied to the clean original. This can be useful, for instance, for comparing an original music track to a version recorded "over the air", e.g., by using a portable recorder to re-record playback over speakers.
otanalyze defines a number of concepts: CLEAN is the original, clean speech; TARGET is a signal that is noise-free but has been filtered by a channel (defined by an FIR filter, FILTER); NOISE is an additional noise component; and MIX is the combination of TARGET and NOISE.
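The relationship between these four signals can be sketched in a few lines of Python. This is only an illustration of the signal model described above (MIX = FILTER applied to CLEAN, plus NOISE), not code from otanalyze; the function names `fir_filter` and `make_mix` and the toy sample values are assumptions made for the example.

```python
def fir_filter(signal, taps):
    """Convolve `signal` with the FIR `taps`, truncated to len(signal)."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * signal[n - k]
        out.append(acc)
    return out


def make_mix(clean, taps, noise):
    """Return (target, mix): target = FILTER applied to CLEAN, mix = target + NOISE."""
    target = fir_filter(clean, taps)
    mix = [t + v for t, v in zip(target, noise)]
    return target, mix


# Toy values (hypothetical, chosen only to make the arithmetic easy to follow).
clean = [1.0, 0.0, 0.0, 2.0]
taps = [0.5, 0.25]              # a two-tap "channel"
noise = [0.1, -0.1, 0.0, 0.2]
target, mix = make_mix(clean, taps, noise)
```

otanalyze works in the opposite direction: given MIX and CLEAN, it estimates FILTER, TARGET and NOISE.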
Time alignment
In its simplest form, otanalyze can be used to find the time alignment between a reference track and a short excerpt (perhaps recorded over the air). The command below treats "clean" as the reference and "mix" as the short excerpt (which is assumed to be a filtered version of an excerpt from "clean", with added noise). The "SNR" estimate can be used to judge whether this is a true match: a chance match will yield an SNR below -3 dB.
otanalyze -clean 13693.mp3 -mix 13693.wav
++++++ otanalyze v0.71 ++++++
Reading MIX from 13693.wav ...
Reading CLEAN from 13693.mp3 ...
Identifying CLEAN in MIX...
Mix delay= 24.654 s
Mix SNR= 8.35 dB
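When otanalyze is called from another program, the delay and SNR have to be recovered from the printed output. The sketch below is one way to do that in Python, assuming the output has exactly the "Mix delay= ... s" and "Mix SNR= ... dB" form shown above. The helper names `parse_otanalyze_log` and `looks_like_true_match` are hypothetical, and the handling of "-Inf" is an assumption based on the changelog note that a pure-zero signal is reported that way.

```python
import re


def parse_otanalyze_log(text):
    """Extract (delay_seconds, snr_db) from otanalyze's printed output."""
    delay = re.search(r"Mix delay=\s*(-?[\d.]+)\s*s", text)
    # Assumption: a pure-zero signal is printed as "-Inf", per the changelog.
    snr = re.search(r"Mix SNR=\s*(-?(?:[\d.]+|Inf))\s*dB", text)
    if delay is None or snr is None:
        raise ValueError("delay or SNR line not found in output")
    return float(delay.group(1)), float(snr.group(1))


def looks_like_true_match(snr_db, threshold_db=-3.0):
    """Apply the rule of thumb that chance matches fall below -3 dB."""
    return snr_db >= threshold_db


log = """++++++ otanalyze v0.71 ++++++
Reading MIX from 13693.wav ...
Reading CLEAN from 13693.mp3 ...
Identifying CLEAN in MIX...
Mix delay= 24.654 s
Mix SNR= 8.35 dB"""

delay_s, snr_db = parse_otanalyze_log(log)   # (24.654, 8.35)
is_match = looks_like_true_match(snr_db)     # True
```

The regular expressions do not depend on line breaks, so they also work if the output arrives as a single line.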
Signal decomposition
The main use case is decomposing a MIX into a TARGET (matching a CLEAN) and a NOISE component. The command below performs the decomposition and saves the components. The input MIX is the sum of the output NOISE and TARGET, and the output TARGET is approximately the output FILTER applied to the input CLEAN (only approximately, because the estimated filter can be slowly time-varying and there is an internal timing skew adjustment that would not be preserved). otanalyze is designed to be called from the command line, so it reads all of its input from, and writes all of its output to, sound files:
otanalyze -mix 13693.wav -clean 13693.mp3 -noiseout noise.wav -targetout targ.wav -filterout filt.wav -disp 1
++++++ otanalyze v0.71 ++++++
Reading MIX from 13693.wav ...
Reading CLEAN from 13693.mp3 ...
Identifying CLEAN in MIX...
Mix delay= 24.654 s
Mix SNR= 8.35 dB
FILTER saved to filt.wav
NOISE saved to noise.wav
TARGET saved to targ.wav
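Because MIX is the sum of TARGET and NOISE, the noise is simply the residual left after subtracting the target, and a signal-to-noise ratio follows from the two energies. The Python sketch below uses the conventional energy-ratio definition of SNR in decibels; whether otanalyze computes its reported figure in exactly this way is an assumption, and the function names are hypothetical.

```python
import math


def residual_noise(mix, target):
    """NOISE is what remains of MIX once TARGET is subtracted."""
    return [m - t for m, t in zip(mix, target)]


def snr_db(target, noise):
    """Energy ratio of TARGET to NOISE, in decibels (conventional definition)."""
    e_target = sum(x * x for x in target)
    e_noise = sum(x * x for x in noise)
    if e_target == 0.0:
        return float("-inf")   # no target energy at all
    if e_noise == 0.0:
        return float("inf")    # perfectly clean target
    return 10.0 * math.log10(e_target / e_noise)


# Toy vectors (hypothetical values, not otanalyze output).
target = [1.0, -1.0, 1.0, -1.0]
mix = [1.1, -0.9, 0.9, -1.1]
noise = residual_noise(mix, target)
estimate = snr_db(target, noise)
```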
Command line options
All parameters to otanalyze are specified in the command line via "-optionname value" pairs. The full set of options is:
otanalyze -help
otanalyze v0.71 of 20130125
usage: otanalyze ...
  -clean <filename>         The name of the clean (reference) sound file
  -targetout <filename>     Where to save extracted target speech
  -mix <filename>           Input mixture of noise and channel-filtered target
  -noiseout <filename>      Where to write the separated noise residual
  -filterout <filename>     Where to save the estimated channel response
  -start <time_secs>        Start processing at this time in the files
  -end <time_secs>          Finish processing at this point in the files
  -cleanskip <time_secs>    Drop time from start of CLEAN (or MIX if -ve)
  -samplerate <sr_hz>       Resample to (low) sampling rate before xcorr
  -disp <bool>              display graphics if set
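Since every parameter is an "-optionname value" pair, a command line is easy to assemble programmatically. The Python sketch below builds the argument list from keyword arguments and rejects names that are not in the help text above. The wrapper functions are hypothetical helpers, not part of the package, and the executable name "otanalyze" assumes the compiled program is on your PATH.

```python
import subprocess

# Option names taken from the -help output of v0.71.
KNOWN_OPTIONS = ("clean", "targetout", "mix", "noiseout", "filterout",
                 "start", "end", "cleanskip", "samplerate", "disp")


def build_otanalyze_args(executable="otanalyze", **options):
    """Turn keyword arguments into an '-optionname value' argument list."""
    args = [executable]
    for name, value in options.items():
        if name not in KNOWN_OPTIONS:
            raise ValueError(f"unknown option: -{name}")
        args += [f"-{name}", str(value)]
    return args


def run_otanalyze(**options):
    """Run otanalyze and return its printed output (needs the package and MCR)."""
    result = subprocess.run(build_otanalyze_args(**options),
                            capture_output=True, text=True, check=True)
    return result.stdout


args = build_otanalyze_args(mix="13693.wav", clean="13693.mp3",
                            noiseout="noise.wav", disp=1)
```

The changelog also mentions a -tfilt parameter that does not appear in the help text; add it to `KNOWN_OPTIONS` if your build accepts it.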
Direct functions
The otanalyze script is mainly concerned with handling file input and output, and with deciding which operations (separation, remixing, etc.) to perform. For use within Matlab, you can call the following directly to perform the processing itself:
- [noise, targ, filt, SNR] = find_in_mix(mix, clean, sr) - takes waveform vectors for MIX and CLEAN, and extracts NOISE, TARG, and FILT vectors, as well as returning the effective SNR. find_in_mix relies on find_skew to make a rough alignment between MIX and CLEAN, then on decomp_lin_win and decomp_lin (which in turn uses whiten) to perform the actual decomposition.
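The rough alignment step can be pictured as a search for the lag at which the excerpt correlates best with the reference. The Python sketch below uses a brute-force normalized cross-correlation. It illustrates the general idea only and is not the find_skew implementation; the function name `find_delay`, the random test data, and the 8000 Hz sample rate are all assumptions. Unlike otanalyze, it does not handle an excerpt that starts before the reference.

```python
import math
import random


def find_delay(reference, excerpt):
    """Return the sample lag in `reference` where `excerpt` correlates best."""
    n = len(excerpt)
    best_lag, best_score = 0, float("-inf")
    for lag in range(len(reference) - n + 1):
        window = reference[lag:lag + n]
        energy = sum(w * w for w in window)
        if energy == 0.0:
            continue  # a silent window cannot match anything
        # Normalizing by window energy stops loud passages from winning by default.
        score = sum(w * e for w, e in zip(window, excerpt)) / math.sqrt(energy)
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Synthetic data: the excerpt is an attenuated, slightly noisy copy of
# reference[60:100] (hypothetical values for illustration).
rng = random.Random(0)
reference = [rng.gauss(0.0, 1.0) for _ in range(200)]
excerpt = [0.5 * x + rng.gauss(0.0, 0.01) for x in reference[60:100]]

lag = find_delay(reference, excerpt)
delay_seconds = lag / 8000.0   # assumed sample rate of 8000 Hz
```

This direct search costs on the order of len(reference) × len(excerpt) operations, which is one reason to correlate at a reduced sampling rate (the -samplerate option) when the reference is long.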
Installation
This package has been compiled for several targets using the Matlab compiler. You will also need to download the Matlab Compiler Runtime (MCR) Installer and use it to install the MCR. Please see the table below:
Architecture | Compiled package | MCR Installer |
---|---|---|
64 bit Linux | otanalyze_GLNXA64.zip | Linux 64 bit MCR Installer |
64 bit MacOS | otanalyze_MACI64.zip | MACI64 MCR Installer |
The original Matlab code used to build this compiled target is available at
<http://labrosa.ee.columbia.edu/projects/otanalyze/>
All sources are in the package otanalyze-v0.71.zip.
Feel free to contact me with any problems.
Notes
audioread is able to read a wide range of sound file types, but relies on a number of other packages and/or support functions being installed. Most obscure of these is ReadSound, a MEX wrapper I wrote for the dpwelib sound file interface.
Changelog
- v0.71 2013-01-25 Modified shell script to handle file names with spaces; increased pre-peak collar in find_in_mix from 5 ms to 10 ms; supports Tfilt (estimated FIR duration) as -tfilt parameter (default still 0.040 sec).
- v0.7 2012-06-14 Fixed case where one signal is pure zero. Now reports SNR as -Inf.
- v0.6 2012-01-20 Improved calculation of cross-correlation when clean is very long; fixed bug with odd-length query.
- v0.5 2011-09-20 Fixed bug where dclean.wav and dmix.wav were always written out, and fixed bug where run_otanalyze_prj.sh -help resulted in a long string of Matlab Runtime error messages.
- v0.4 2011-09-08 Extended efficient handling of subsets of very long files to WAV and AIF files too. Fixed bug where reported delays were always off by 1.0 sec.
- v0.3 2011-09-07 Improved handling of long files, especially for m4a (aac) files. Added -help option to print help message. Now works even if "mix" starts earlier than "clean".
- v0.2 2011-09-06 Introduced skewview's -samplerate option for more memory-efficient cross-correlations (at lower sampling rates).
- v0.1 2011-08-22 Adapted from renoiser.

Last updated: $Date: 2011/08/22 20:21:23 $
Dan Ellis <dpwe@ee.columbia.edu>