CREATE_WSJ - script to generate noised-up version of WSJ
Introduction
renoiser is a tool that estimates the channel applied to clean speech from a filtered, noisy recording of that speech. It can then optionally recombine the same or different speech, filtered by the extracted channel, with the residual noise at any SNR. This script shows how to generate a filtered, noised-up version of an existing speech corpus. Specifically, we use the RATS rebroadcast example signals (LDC2011E20) to estimate the noise and filter characteristics of each channel, then apply them to one directory from the WSJ speech corpus.
First, we analyze the characteristics of each channel:
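To make "recombine at any SNR" concrete, here is a minimal sketch of mixing a target signal with noise at a chosen SNR. It is not taken from renoiser: the signals `target` and `noise` are synthetic stand-ins, and the variable names are assumptions chosen for illustration.

```matlab
% Hypothetical stand-ins: 'target' plays the role of filtered clean speech,
% 'noise' the role of the residual channel noise. Neither comes from renoiser.
sr = 8000;                            % sample rate in Hz
t = (0:sr-1)'/sr;                     % 1 second of time stamps
target = 0.5*sin(2*pi*440*t);         % test tone standing in for speech
noise = 0.1*randn(sr,1);              % white noise standing in for channel noise
snr_db = 6;                           % desired output SNR in dB

% Average power of each component
p_target = mean(target.^2);
p_noise = mean(noise.^2);

% Noise gain g such that 10*log10(p_target / (g^2 * p_noise)) == snr_db
g = sqrt(p_target / (p_noise * 10^(snr_db/10)));
mix = target + g*noise;

% Check the SNR actually achieved by the scaled noise
achieved_db = 10*log10(p_target / mean((g*noise).^2));
fprintf('Achieved SNR: %.2f dB\n', achieved_db);
```

Scaling the noise rather than the target keeps the speech level fixed; a real tool may choose differently, for example to avoid clipping.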
droot = ['../../data/LDC2011E20/data/default/' ...
         '20110316_145021_recvrcali_default_'];
tst = 8.6;
tend = 16.0;
list = '4bj1-2.txt';
mixoutbase = 'WSJ/ch';
% There's really no energy above 4 kHz
TARGETSR = 8000;
% all 8 channels
for chan = 'ABCDEFGH'
Analysis
  % Create a per-channel output directory
  dirname = [mixoutbase,chan];
  mkdir(dirname);
  noisename = fullfile(dirname,'noise.wav');
  filtername = fullfile(dirname,'filter.wav');
  % Analyze the channel, remember the reported SNR
  [d,sr,SNR,fshift] = renoiser('-clean', [droot,'REF.flac'], ...
                               '-mix', [droot,chan, '.flac'], ...
                               '-disp', 1, '-targetsr', TARGETSR, ...
                               '-noisefloor', '-30', ...
                               '-checkfshift', '1', ...
                               '-start', tst, '-end', tend, ...
                               '-targetout', fullfile(dirname,'target.wav'), ...
                               '-noiseout', noisename, ...
                               '-filterout', filtername);
Warning: Directory already exists.
++++++ renoiser ++++++
Reading CLEAN from ../../data/LDC2011E20/data/default/20110316_145021_recvrcali_default_REF.flac ...
*** audioread: resampling 20110316_145021_recvrcali_default_REF from 16000 to 8000
Reading MIX from ../../data/LDC2011E20/data/default/20110316_145021_recvrcali_default_A.flac ...
*** audioread: resampling 20110316_145021_recvrcali_default_A from 16000 to 8000
Identifying CLEAN in MIX...
Mix freq shift= 0.0 Hz
Mix delay= -0.019 s
FILTER saved to WSJ/chA/filter.wav
NOISE saved to WSJ/chA/noise.wav
TARGET saved to WSJ/chA/target.wav
Input mix SNR= 15.58 dB

Warning: Directory already exists.
++++++ renoiser ++++++
Reading CLEAN from ../../data/LDC2011E20/data/default/20110316_145021_recvrcali_default_REF.flac ...
*** audioread: resampling 20110316_145021_recvrcali_default_REF from 16000 to 8000
Reading MIX from ../../data/LDC2011E20/data/default/20110316_145021_recvrcali_default_B.flac ...
*** audioread: resampling 20110316_145021_recvrcali_default_B from 16000 to 8000
Identifying CLEAN in MIX...
Mix freq shift= 0.0 Hz
Mix delay= -0.019 s
FILTER saved to WSJ/chB/filter.wav
NOISE saved to WSJ/chB/noise.wav
TARGET saved to WSJ/chB/target.wav
Input mix SNR= 6.04 dB

Warning: Directory already exists.
++++++ renoiser ++++++
Reading CLEAN from ../../data/LDC2011E20/data/default/20110316_145021_recvrcali_default_REF.flac ...
*** audioread: resampling 20110316_145021_recvrcali_default_REF from 16000 to 8000
Reading MIX from ../../data/LDC2011E20/data/default/20110316_145021_recvrcali_default_C.flac ...
*** audioread: resampling 20110316_145021_recvrcali_default_C from 16000 to 8000
Identifying CLEAN in MIX...
Mix freq shift= 0.0 Hz
Mix delay= -0.042 s
FILTER saved to WSJ/chC/filter.wav
NOISE saved to WSJ/chC/noise.wav
TARGET saved to WSJ/chC/target.wav
Input mix SNR= 6.23 dB

Warning: Directory already exists.
++++++ renoiser ++++++
Reading CLEAN from ../../data/LDC2011E20/data/default/20110316_145021_recvrcali_default_REF.flac ...
*** audioread: resampling 20110316_145021_recvrcali_default_REF from 16000 to 8000
Reading MIX from ../../data/LDC2011E20/data/default/20110316_145021_recvrcali_default_D.flac ...
*** audioread: resampling 20110316_145021_recvrcali_default_D from 16000 to 8000
Identifying CLEAN in MIX...
Mix freq shift= -180.9 Hz
Mix delay= -0.045 s
FILTER saved to WSJ/chD/filter.wav
NOISE saved to WSJ/chD/noise.wav
TARGET saved to WSJ/chD/target.wav
Input mix SNR= 3.53 dB

Warning: Directory already exists.
++++++ renoiser ++++++
Reading CLEAN from ../../data/LDC2011E20/data/default/20110316_145021_recvrcali_default_REF.flac ...
*** audioread: resampling 20110316_145021_recvrcali_default_REF from 16000 to 8000
Reading MIX from ../../data/LDC2011E20/data/default/20110316_145021_recvrcali_default_E.flac ...
*** audioread: resampling 20110316_145021_recvrcali_default_E from 16000 to 8000
Identifying CLEAN in MIX...
Mix freq shift= 0.0 Hz
Mix delay= -0.045 s
FILTER saved to WSJ/chE/filter.wav
NOISE saved to WSJ/chE/noise.wav
TARGET saved to WSJ/chE/target.wav
Input mix SNR= 0.93 dB

Warning: Directory already exists.
++++++ renoiser ++++++
Reading CLEAN from ../../data/LDC2011E20/data/default/20110316_145021_recvrcali_default_REF.flac ...
*** audioread: resampling 20110316_145021_recvrcali_default_REF from 16000 to 8000
Reading MIX from ../../data/LDC2011E20/data/default/20110316_145021_recvrcali_default_F.flac ...
*** audioread: resampling 20110316_145021_recvrcali_default_F from 16000 to 8000
Identifying CLEAN in MIX...
Mix freq shift= 0.0 Hz
Mix delay= 0.014 s
Warning: reducing gain of filter (and noise, and target) by 0.59314 x to avoid clipping.
FILTER saved to WSJ/chF/filter.wav
NOISE saved to WSJ/chF/noise.wav
TARGET saved to WSJ/chF/target.wav
Input mix SNR= 2.98 dB

Warning: Directory already exists.
++++++ renoiser ++++++
Reading CLEAN from ../../data/LDC2011E20/data/default/20110316_145021_recvrcali_default_REF.flac ...
*** audioread: resampling 20110316_145021_recvrcali_default_REF from 16000 to 8000
Reading MIX from ../../data/LDC2011E20/data/default/20110316_145021_recvrcali_default_G.flac ...
*** audioread: resampling 20110316_145021_recvrcali_default_G from 16000 to 8000
Identifying CLEAN in MIX...
Mix freq shift= 0.0 Hz
Mix delay= -0.048 s
FILTER saved to WSJ/chG/filter.wav
NOISE saved to WSJ/chG/noise.wav
TARGET saved to WSJ/chG/target.wav
Input mix SNR= 18.66 dB

Warning: Directory already exists.
++++++ renoiser ++++++
Reading CLEAN from ../../data/LDC2011E20/data/default/20110316_145021_recvrcali_default_REF.flac ...
*** audioread: resampling 20110316_145021_recvrcali_default_REF from 16000 to 8000
Reading MIX from ../../data/LDC2011E20/data/default/20110316_145021_recvrcali_default_H.flac ...
*** audioread: resampling 20110316_145021_recvrcali_default_H from 16000 to 8000
Identifying CLEAN in MIX...
Mix freq shift= -120.7 Hz
Mix delay= -0.044 s
Warning: reducing gain of filter (and noise, and target) by 0.59648 x to avoid clipping.
FILTER saved to WSJ/chH/filter.wav
NOISE saved to WSJ/chH/noise.wav
TARGET saved to WSJ/chH/target.wav
Input mix SNR= 3.01 dB
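Channels D and H report a nonzero "Mix freq shift", which the synthesis step later reapplies through '-fshift'. How renoiser implements the shift is not shown here. The sketch below illustrates one standard approach, single-sideband shifting via the analytic signal, using only base MATLAB functions. The test tone and all variable names are assumptions for illustration.

```matlab
sr = 8000;                        % sample rate in Hz
fshift = -180.9;                  % shift in Hz (value reported for channel D)
n = 8000;                         % number of samples (even), so bins are 1 Hz apart
t = (0:n-1)'/sr;
x = cos(2*pi*1000*t);             % 1 kHz test tone standing in for audio

% Build the analytic signal with an FFT: keep DC and Nyquist,
% double the positive frequencies, zero the negative frequencies.
X = fft(x);
h = zeros(n,1);
h(1) = 1;
h(n/2+1) = 1;
h(2:n/2) = 2;
xa = ifft(X .* h);

% Multiply by a complex exponential at fshift Hz and keep the real part.
y = real(xa .* exp(2i*pi*fshift*t));

% Locate the spectral peak of the shifted signal (positive frequencies only).
Y = abs(fft(y));
[~, k] = max(Y(1:n/2));
peak_hz = (k-1)*sr/n;
fprintf('Peak moved from 1000 Hz to about %d Hz\n', round(peak_hz));
```

Unlike resampling, this moves every component by the same number of Hz, so harmonic relationships in speech are not preserved; that matches the kind of carrier offset a radio channel can introduce.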
Synthesis
  % smooth background noise over a 1s window to remove
  % noise-correlated features
  laundersecs = 1.0;
  % Process a directory full of WSJ files, use the SNR from analysis
  renoiser('-cleanlist', list, '-mixoutdir', dirname, ...
           '-noise', noisename, '-filter', filtername, ...
           '-laundernoise', laundersecs, '-SNR', SNR, '-fshift', fshift);
  % also, extract the original for reference
  [d,sr] = audioread([droot,chan,'.flac'],TARGETSR);
  wavwrite(d(round(tst*sr)+[1:round((tend-tst)*sr)]),sr,...
           fullfile(dirname,'orig.wav'));
++++++ renoiser ++++++
Reading CLEAN from 4bj/4bja0101.wv2 ...
Reading FILTER from WSJ/chA/filter.wav ...
*** audioread: resampling filter from 8000 to 16000
Reading NOISE from WSJ/chA/noise.wav ...
*** audioread: resampling noise from 8000 to 16000
Filtering CLEAN to produce target...
Creating new output mix at SNR 15.5782 dB ...
Analyzing/resynthesizing NOISE to launder it...
MIX saved to WSJ/chA/4bja0101.wav
++++++ renoiser ++++++
Reading CLEAN from 4bj/4bja0102.wv2 ...
Reading FILTER from WSJ/chA/filter.wav ...
*** audioread: resampling filter from 8000 to 16000
Reading NOISE from WSJ/chA/noise.wav ...
*** audioread: resampling noise from 8000 to 16000
Filtering CLEAN to produce target...
Creating new output mix at SNR 15.5782 dB ...
Analyzing/resynthesizing NOISE to launder it...
MIX saved to WSJ/chA/4bja0102.wav
*** audioread: resampling 20110316_145021_recvrcali_default_A from 16000 to 8000

++++++ renoiser ++++++
Reading CLEAN from 4bj/4bja0101.wv2 ...
Reading FILTER from WSJ/chB/filter.wav ...
*** audioread: resampling filter from 8000 to 16000
Reading NOISE from WSJ/chB/noise.wav ...
*** audioread: resampling noise from 8000 to 16000
Filtering CLEAN to produce target...
Creating new output mix at SNR 6.0418 dB ...
Analyzing/resynthesizing NOISE to launder it...
MIX saved to WSJ/chB/4bja0101.wav
++++++ renoiser ++++++
Reading CLEAN from 4bj/4bja0102.wv2 ...
Reading FILTER from WSJ/chB/filter.wav ...
*** audioread: resampling filter from 8000 to 16000
Reading NOISE from WSJ/chB/noise.wav ...
*** audioread: resampling noise from 8000 to 16000
Filtering CLEAN to produce target...
Creating new output mix at SNR 6.0418 dB ...
Analyzing/resynthesizing NOISE to launder it...
MIX saved to WSJ/chB/4bja0102.wav
*** audioread: resampling 20110316_145021_recvrcali_default_B from 16000 to 8000

++++++ renoiser ++++++
Reading CLEAN from 4bj/4bja0101.wv2 ...
Reading FILTER from WSJ/chC/filter.wav ...
*** audioread: resampling filter from 8000 to 16000
Reading NOISE from WSJ/chC/noise.wav ...
*** audioread: resampling noise from 8000 to 16000
Filtering CLEAN to produce target...
Creating new output mix at SNR 6.2263 dB ...
Analyzing/resynthesizing NOISE to launder it...
MIX saved to WSJ/chC/4bja0101.wav
++++++ renoiser ++++++
Reading CLEAN from 4bj/4bja0102.wv2 ...
Reading FILTER from WSJ/chC/filter.wav ...
*** audioread: resampling filter from 8000 to 16000
Reading NOISE from WSJ/chC/noise.wav ...
*** audioread: resampling noise from 8000 to 16000
Filtering CLEAN to produce target...
Creating new output mix at SNR 6.2263 dB ...
Analyzing/resynthesizing NOISE to launder it...
MIX saved to WSJ/chC/4bja0102.wav
*** audioread: resampling 20110316_145021_recvrcali_default_C from 16000 to 8000

++++++ renoiser ++++++
Reading CLEAN from 4bj/4bja0101.wv2 ...
Reading FILTER from WSJ/chD/filter.wav ...
*** audioread: resampling filter from 8000 to 16000
Reading NOISE from WSJ/chD/noise.wav ...
*** audioread: resampling noise from 8000 to 16000
Filtering CLEAN to produce target...
Creating new output mix at SNR 3.5312 dB ...
Analyzing/resynthesizing NOISE to launder it...
Applying frequency shift of -180.9459 Hz to output ...
MIX saved to WSJ/chD/4bja0101.wav
++++++ renoiser ++++++
Reading CLEAN from 4bj/4bja0102.wv2 ...
Reading FILTER from WSJ/chD/filter.wav ...
*** audioread: resampling filter from 8000 to 16000
Reading NOISE from WSJ/chD/noise.wav ...
*** audioread: resampling noise from 8000 to 16000
Filtering CLEAN to produce target...
Creating new output mix at SNR 3.5312 dB ...
Analyzing/resynthesizing NOISE to launder it...
Applying frequency shift of -180.9459 Hz to output ...
MIX saved to WSJ/chD/4bja0102.wav
*** audioread: resampling 20110316_145021_recvrcali_default_D from 16000 to 8000

++++++ renoiser ++++++
Reading CLEAN from 4bj/4bja0101.wv2 ...
Reading FILTER from WSJ/chE/filter.wav ...
*** audioread: resampling filter from 8000 to 16000
Reading NOISE from WSJ/chE/noise.wav ...
*** audioread: resampling noise from 8000 to 16000
Filtering CLEAN to produce target...
Creating new output mix at SNR 0.92916 dB ...
Analyzing/resynthesizing NOISE to launder it...
MIX saved to WSJ/chE/4bja0101.wav
++++++ renoiser ++++++
Reading CLEAN from 4bj/4bja0102.wv2 ...
Reading FILTER from WSJ/chE/filter.wav ...
*** audioread: resampling filter from 8000 to 16000
Reading NOISE from WSJ/chE/noise.wav ...
*** audioread: resampling noise from 8000 to 16000
Filtering CLEAN to produce target...
Creating new output mix at SNR 0.92916 dB ...
Analyzing/resynthesizing NOISE to launder it...
MIX saved to WSJ/chE/4bja0102.wav
*** audioread: resampling 20110316_145021_recvrcali_default_E from 16000 to 8000

++++++ renoiser ++++++
Reading CLEAN from 4bj/4bja0101.wv2 ...
Reading FILTER from WSJ/chF/filter.wav ...
*** audioread: resampling filter from 8000 to 16000
Reading NOISE from WSJ/chF/noise.wav ...
*** audioread: resampling noise from 8000 to 16000
Filtering CLEAN to produce target...
Creating new output mix at SNR 2.9821 dB ...
Analyzing/resynthesizing NOISE to launder it...
MIX saved to WSJ/chF/4bja0101.wav
++++++ renoiser ++++++
Reading CLEAN from 4bj/4bja0102.wv2 ...
Reading FILTER from WSJ/chF/filter.wav ...
*** audioread: resampling filter from 8000 to 16000
Reading NOISE from WSJ/chF/noise.wav ...
*** audioread: resampling noise from 8000 to 16000
Filtering CLEAN to produce target...
Creating new output mix at SNR 2.9821 dB ...
Analyzing/resynthesizing NOISE to launder it...
MIX saved to WSJ/chF/4bja0102.wav
*** audioread: resampling 20110316_145021_recvrcali_default_F from 16000 to 8000

++++++ renoiser ++++++
Reading CLEAN from 4bj/4bja0101.wv2 ...
Reading FILTER from WSJ/chG/filter.wav ...
*** audioread: resampling filter from 8000 to 16000
Reading NOISE from WSJ/chG/noise.wav ...
*** audioread: resampling noise from 8000 to 16000
Filtering CLEAN to produce target...
Creating new output mix at SNR 18.6591 dB ...
Analyzing/resynthesizing NOISE to launder it...
MIX saved to WSJ/chG/4bja0101.wav
++++++ renoiser ++++++
Reading CLEAN from 4bj/4bja0102.wv2 ...
Reading FILTER from WSJ/chG/filter.wav ...
*** audioread: resampling filter from 8000 to 16000
Reading NOISE from WSJ/chG/noise.wav ...
*** audioread: resampling noise from 8000 to 16000
Filtering CLEAN to produce target...
Creating new output mix at SNR 18.6591 dB ...
Analyzing/resynthesizing NOISE to launder it...
MIX saved to WSJ/chG/4bja0102.wav
*** audioread: resampling 20110316_145021_recvrcali_default_G from 16000 to 8000

++++++ renoiser ++++++
Reading CLEAN from 4bj/4bja0101.wv2 ...
Reading FILTER from WSJ/chH/filter.wav ...
*** audioread: resampling filter from 8000 to 16000
Reading NOISE from WSJ/chH/noise.wav ...
*** audioread: resampling noise from 8000 to 16000
Filtering CLEAN to produce target...
Creating new output mix at SNR 3.008 dB ...
Analyzing/resynthesizing NOISE to launder it...
Applying frequency shift of -120.6757 Hz to output ...
MIX saved to WSJ/chH/4bja0101.wav
++++++ renoiser ++++++
Reading CLEAN from 4bj/4bja0102.wv2 ...
Reading FILTER from WSJ/chH/filter.wav ...
*** audioread: resampling filter from 8000 to 16000
Reading NOISE from WSJ/chH/noise.wav ...
*** audioread: resampling noise from 8000 to 16000
Filtering CLEAN to produce target...
Creating new output mix at SNR 3.008 dB ...
Analyzing/resynthesizing NOISE to launder it...
Applying frequency shift of -120.6757 Hz to output ...
MIX saved to WSJ/chH/4bja0102.wav
*** audioread: resampling 20110316_145021_recvrcali_default_H from 16000 to 8000
end
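The index expression used to cut the orig.wav excerpt can be checked in isolation. This small sketch repeats the same arithmetic with the script's own values (tst, tend, and the 8000 Hz target rate); the names first, last, and count are introduced here only for illustration.

```matlab
tst = 8.6;                 % excerpt start in seconds (from the script)
tend = 16.0;               % excerpt end in seconds (from the script)
sr = 8000;                 % TARGETSR, the rate audioread resamples to

% Same indexing as d(round(tst*sr)+[1:round((tend-tst)*sr)])
count = round((tend - tst)*sr);      % number of samples in the excerpt
idx = round(tst*sr) + (1:count);     % 1-based sample indices
first = idx(1);
last = idx(end);

fprintf('samples %d..%d (%d samples, %.2f s)\n', first, last, count, count/sr);
% prints: samples 68801..128000 (59200 samples, 7.40 s)
```

Because MATLAB indices start at 1, the excerpt begins at the sample just after time tst and spans exactly tend - tst seconds.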
Results
You can listen to the results in the following table. The files in each row are in that channel's output directory (WSJ/chA through WSJ/chH):
Channel | Original | Target | Noise | Renoised |
Chan A | orig.wav | target.wav | noise.wav | 4bja0101.wav |
Chan B | orig.wav | target.wav | noise.wav | 4bja0101.wav |
Chan C | orig.wav | target.wav | noise.wav | 4bja0101.wav |
Chan D | orig.wav | target.wav | noise.wav | 4bja0101.wav |
Chan E | orig.wav | target.wav | noise.wav | 4bja0101.wav |
Chan F | orig.wav | target.wav | noise.wav | 4bja0101.wav |
Chan G | orig.wav | target.wav | noise.wav | 4bja0101.wav |
Chan H | orig.wav | target.wav | noise.wav | 4bja0101.wav |