Meapsoft



Computers Doing Strange Things with Audio


MEAPsoft Manual

This page provides a more detailed discussion of the components and function of MEAPsoft. For a quick illustration of how to get going, see the MEAPsoft Quick Start Walkthrough.

Segmenter

The segmenter analyzes the input sound file and outputs a list of segments representing events or beats present in the sound.

Controls:

input sound file: select the sound you would like to analyze. The filename of this sound will be used as the base name for all subsequent file i/o operations. You can change this basename (if, for instance, you want to save multiple versions of the segments file for one input sound) in the prefs/about panel.
detect events/detect beats: the segmenter has two different modes, "events" and "beats". "Events" mode simply detects sudden, substantial changes in the sound. The threshold for what qualifies as a substantial change is set via the segment sensitivity slider. "Beats" mode is more complex; the segmenter attempts to identify the tempo of the input sound and then outputs events that are aligned with that tempo. This will only work well for input sounds with fairly simple tempo/beat structures.

Event detector options
- segment sensitivity: a low sensitivity will result in fewer segments than a high sensitivity. Think of this as a sensitivity to change; if your sensitivity is high then everything will seem like an event.
- segment density: determines how closely spaced events can be; higher density will allow very closely spaced events, while lower density will ignore events that occur too close together.
Beat detector options
- cut tempo in half: if the detected beat is too fast, this option will slow it down.

1st event = track start: This tells the segmenter to always count the beginning of the track as the first event, even if there is very little energy there. If a sound starts with a fade in, for instance, and you don't have this checked, then the event detector will probably not detect that fade in as an event and the output segments file will not include that part of the sound.
output segment file: this is the temporary file name that will be used to save the segments file that is generated by the segmenter. If you want to save this file for further use, check the appropriate box in the prefs/about panel.
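
The "events" mode described above can be illustrated with a rough sketch. This is not MEAPsoft's actual detector: it assumes an event is a jump in frame power, and the names `detect_events`, `threshold` (standing in for segment sensitivity), and `min_gap_frames` (standing in for segment density) are hypothetical.

```python
def detect_events(samples, frame_size=4, threshold=0.5, min_gap_frames=2,
                  first_event_is_track_start=True):
    """Return sample offsets where a new segment is assumed to begin.

    threshold      -- hypothetical stand-in for "segment sensitivity"
                      (a lower threshold makes the detector more sensitive)
    min_gap_frames -- hypothetical stand-in for "segment density"
                      (a smaller gap allows more closely spaced events)
    """
    # Mean power of each non-overlapping frame.
    powers = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        powers.append(sum(s * s for s in frame) / frame_size)

    # Optionally force the start of the track to count as the first event.
    events = [0] if first_event_is_track_start else []
    last_event_frame = None
    for i in range(1, len(powers)):
        rise = powers[i] - powers[i - 1]
        far_enough = (last_event_frame is None
                      or i - last_event_frame >= min_gap_frames)
        if rise > threshold and far_enough:
            events.append(i * frame_size)
            last_event_frame = i
    return events
```

In this sketch a quiet fade-in never produces a rise above the threshold, which is why the "1st event = track start" behavior is modeled as an explicit flag rather than left to the detector.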

Feature Extractors

Feature extractors analyze the segments in a .seg file and output a features file containing one or more values representing the features found. Some feature extractors, like ChunkPower, simply output one number representing the total power in each segment. Others, like AvgChroma, output an array of values for each segment. For a short description of each feature extractor, hover over the feature extractor's name and a tooltip will pop up.

You can select as many feature extractors as you like, although best results are usually obtained by selecting feature extractors that work well with the composers you'll feed the .feat file to. The number box beside each feature extractor is for entering weights that allow you to specify the relative importance of each feature in the analysis.
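
One plausible way to picture the effect of feature weights is a weighted distance between two chunks. The manual does not say how MEAPsoft applies the weights internally, so the function below (`weighted_distance`, a hypothetical name) is only an assumption about what "relative importance" could mean.

```python
import math


def weighted_distance(features_a, features_b, weights):
    """Euclidean distance with each feature difference scaled by its weight.

    Assumption for illustration only: a weight of 0 makes a feature
    irrelevant, and a larger weight makes differences in it count more.
    """
    return math.sqrt(sum(
        (w * (a - b)) ** 2
        for a, b, w in zip(features_a, features_b, weights)
    ))
```

Under this assumption, two chunks that differ only in a zero-weighted feature are treated as identical.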

"Meta feature extractors" are a special class of FEs that do higher level analysis on the outputs of previously run "normal" feature extractors. In order to use a meta feature extractor, you need to select at least one normal feature extractor in addition to the meta feature extractor. If "clear non-meta features" is selected, the output features file will only contain the selected meta features.

The "Display extracted features" button is enabled after processing a segments file. It pops up a window with a simple viewer that allows you to inspect the values of the extracted features. This is useful if you need to set value ranges in a Composer.

Available feature extractors:

AvgChroma - 12-dimensional vector of energy distribution across each semitone of the octave.
AvgChromaScalar - Single value giving dominant semitone within the octave.
AvgChunkPower - Computes the average power in each chunk.
AvgFreqSimple - Provides a frequency estimation for each segment of sound.
AvgMelSpec - Computes the mean spectrum of a chunk and converts it to the perceptually weighted Mel frequency scale.
AvgMFCC - Computes the mean MFCCs of a chunk, a commonly used feature in speech recognition.
AvgPitchSimple - Provides a pitch estimation for each segment of sound.
AvgSpecCentroid - Computes the average spectral center of mass of a chunk's frames.
AvgSpecFlatness - Provides a measure of the peakiness of the chunk's average spectrum.
AvgSpec - Computes the mean spectrum of each chunk.
ChunkLength - Simply returns the length of each segment.
ChunkStartTime - Simply returns the start time of each segment. Good for making backwards tracks!
Likelihood - MetaFeatureExtractor. Returns the likelihood of each chunk based on its current features. Lower numbers mean a segment is more common; higher numbers mean it is more distinct.
SpectralStability - Tracks the stability of the spectral energy within each chunk of sound. More spectrally stable chunks are more likely to be pitched material.
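
To make a few of the simpler extractors concrete, here is a minimal sketch in the spirit of AvgChunkPower, ChunkLength, ChunkStartTime, and AvgChromaScalar. It is not the MEAPsoft implementation: the function names, the (start, length) segment tuples, and the dictionary layout are hypothetical, and the semitone mapping assumes A4 = 440 Hz with pitch class 0 = C.

```python
import math


def avg_chunk_power(samples):
    """Mean of the squared sample values in one chunk."""
    return sum(s * s for s in samples) / len(samples)


def pitch_class(freq_hz):
    """Semitone within the octave (0 = C ... 11 = B), assuming A4 = 440 Hz."""
    midi_note = 69 + 12 * math.log2(freq_hz / 440.0)
    return round(midi_note) % 12


def extract_features(samples, segments, sample_rate):
    """Compute one feature row per (start_sample, length_in_samples) segment."""
    rows = []
    for start, length in segments:
        chunk = samples[start:start + length]
        rows.append({
            "start_time": start / sample_rate,   # seconds
            "length": length / sample_rate,      # seconds
            "power": avg_chunk_power(chunk),
        })
    return rows
```

Power, length, and start time each yield a single number per segment, while a feature such as AvgChroma would yield a 12-element vector per segment instead.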

Composers

A composer takes a feature file as an input, analyzes/sorts/modifies the segments in that file, and then creates an Edit Decision List (EDL) representing the order in which the segments from the original source sound file (as well as others) should be arranged by the Synthesizer. Composers can be very simple or very complex. For example, "simple sort" simply sorts a features file by the first feature in each chunk. You could use this with AvgPitchSimple to generate a glissando where all of the pitches in the input sound are arranged from low to high. More complex Composers, like "MashUp" and "head bang" perform more sophisticated operations.
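
The "simple sort" idea can be sketched as follows. The chunk and EDL layouts here are hypothetical (the real .feat and .edl file formats are not described on this page); the sketch only shows chunks being reordered by their first feature and assigned new destination times.

```python
def simple_sort(chunks, descending=False):
    """Order chunks by their first feature value and build a toy EDL.

    Each chunk is assumed to be a dict with "start" and "length" (seconds)
    and a "features" list; these field names are illustrative only.
    """
    ordered = sorted(chunks, key=lambda c: c["features"][0],
                     reverse=descending)
    edl = []
    dest_time = 0.0
    for chunk in ordered:
        edl.append({
            "dest_time": dest_time,       # where the chunk lands in the output
            "src_start": chunk["start"],  # where it came from in the source
            "length": chunk["length"],
        })
        dest_time += chunk["length"]
    return edl
```

With a pitch estimate as the first feature, the resulting destination order runs from the lowest-pitched chunk to the highest, which is the glissando effect described above.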

Each composer is described by a short text that appears when you select it. The controls for that composer (if any) will appear below the text.

Controls: Each composer's controls (if any) are different, and should be described in that composer's explanatory text. The following Universal Chunk Operations apply to all composers.

reverse: reverse audio in each chunk.
apply fade in/out: to avoid pops between segments you may want to apply a quick fade in/out at the boundaries of each chunk.
cross fade: slightly shift each segment so that their fade in/outs overlap. This generally results in smoother, less clicky sound.
fade length (ms): the duration of the fade in/out, in milliseconds.
apply gain value: adds a gain value to each segment in the output EDL. The synthesizer will use the gain value to scale the amplitude of the samples it writes to the output soundfile.
gain value: the gain value to add to each segment.
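
As an illustration of the fade and gain options, the sketch below applies a linear fade in/out and a gain factor to one chunk of samples. The linear envelope shape and the function names are assumptions; the manual does not specify which fade curve MEAPsoft uses.

```python
def ms_to_samples(fade_ms, sample_rate):
    """Convert a fade length in milliseconds to a whole number of samples."""
    return int(sample_rate * fade_ms / 1000)


def apply_fade_and_gain(samples, fade_len, gain=1.0):
    """Apply a linear fade in/out of fade_len samples, then scale by gain."""
    n = len(samples)
    # Never let the two fades overlap inside a very short chunk.
    fade_len = min(fade_len, n // 2)
    out = []
    for i, s in enumerate(samples):
        envelope = 1.0
        if fade_len > 0:
            # Ramp up from the first sample and down toward the last one.
            envelope = min(1.0, i / fade_len, (n - 1 - i) / fade_len)
        out.append(s * envelope * gain)
    return out
```

Because the envelope reaches zero at both chunk boundaries, adjacent chunks no longer meet at an abrupt amplitude jump, which is the source of the pops mentioned above.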

The "Display composed features" button is like the "Display extracted features" button above, but it works on the newly composed EDL file. This is convenient for inspecting the results of the composer. For instance, if you run the "simple sort" composer and display the EDL file you will see that the jumpy colors in the .feat file have been turned into a smooth fade in the sorted EDL.

Available composers:

BlipComposer - BlipComposer inserts a blip at the beginning of each chunk in the input features file. Especially useful for understanding the output of the segmenter.
EDLComposer - EDLComposer applies composer options (gain, crossfade, etc.) to an existing .edl file. It is meant to be used to generate output from the visualizers.
HeadBangComposer - HeadBangComposer rocks it hard-core style. Finds the most common chunk length L and lengths related by a factor of 2, i.e. L/2, L/4, L/8, L*2. These chunks are then shuffled to create a new piece with a clear beat.
HMMComposer - HMMComposer uses a features file to train a simple statistical model of a song and uses it to randomly generate a new sequence of chunks. This works best when used with chunks created by the beat detector.
IntraChunkShuffleComposer - IntraChunkShuffleComposer chops each chunk up into small pieces and rearranges them. This keeps the meta chunks intact but scrambles them on a local level.
MashupComposer - MashupComposer attempts to match chunks in the input features file using chunks from the chunk database features file. The result is an approximation of the source sound file built from chunks in the chunk database.
MeapaeMComposer - MeapaeMComposer makes palindromes by writing each chunk of audio forward and then backward.
NNComposer - NNComposer starts at the first chunk and proceeds through the sound file from each chunk to its nearest neighbor, according to the features in the input features file.
RotComposer - RotComposer rotates the beats in each measure by a selectable number of positions. You can set the number of beats/measure, the number of positions to rotate, and the direction of rotation.
SortComposer - SortComposer sorts the features in ascending or descending order. If there are multiple features, or more than one value per feature, it sorts according to distance in Euclidean space.
ThresholdComposer - ThresholdComposer selects chunks with feature values falling inside the top and bottom thresholds. It then creates an output file composed exclusively of either the selected chunks or the not-selected chunks. ThresholdComposer only really makes sense for one-dimensional features like pitch and power.
VQComposer - VQComposer trains a vector quantizer on the chunks in the input file. It then uses it to quantize the chunks in another file. For best results use the beat segmenter so each chunk has roughly the same length.
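
As one example from the list above, the rotation performed by RotComposer can be sketched on a plain list of beat chunks. This is an illustrative reading of the description, not the actual implementation: `rotate_measures` is a hypothetical name, and the handling of an incomplete final measure (rotated within its own shorter length) is an assumption.

```python
def rotate_measures(chunks, beats_per_measure=4, positions=1,
                    rotate_left=True):
    """Rotate the beats inside each measure by a number of positions."""
    out = []
    for start in range(0, len(chunks), beats_per_measure):
        measure = chunks[start:start + beats_per_measure]
        k = positions % len(measure)
        if not rotate_left:
            # A right rotation by k equals a left rotation by len - k.
            k = (len(measure) - k) % len(measure)
        out.extend(measure[k:] + measure[:k])
    return out
```

For instance, with four beats per measure and a left rotation of one position, beat 2 of every measure becomes its new downbeat.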

Synthesizer

The Synthesizer uses the EDL from a composer to construct a new audio file. The output sound file name is automatically set to the input sound file name + MEAPED.wav. You can change this if you like. Once the new audio file has been created the "Listen" button will be active. Clicking on the button will launch your preferred .wav playback application.
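
Conceptually, the synthesis step copies each chunk named in the EDL from the source audio to its destination position. The sketch below assumes a hypothetical EDL layout with sample-based "dest", "src", and "length" fields plus an optional "gain"; it mixes overlapping chunks by summing them, which is one simple way cross-faded chunks could combine.

```python
def synthesize(source, edl):
    """Build an output sample list from source samples and a toy EDL."""
    if not edl:
        return []
    total = max(entry["dest"] + entry["length"] for entry in edl)
    out = [0.0] * total
    for entry in edl:
        chunk = source[entry["src"]:entry["src"] + entry["length"]]
        gain = entry.get("gain", 1.0)
        for i, sample in enumerate(chunk):
            # Sum rather than overwrite so overlapping chunks are mixed.
            out[entry["dest"] + i] += sample * gain
    return out
```

Writing the result to a .wav file and launching a player are left out; only the rearrangement itself is shown.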

Prefs/About

System wide preferences are set here.

file i/o base name: this is the prefix used when creating temp files during processing steps. You can usually leave this as is. The primary reason to change this is if you are saving the intermediate output files for further use/analysis.
audio player: select the program that will be used to play audio when clicking on the listen buttons in the segmenter and synthesizer tabs.
save .seg .edl .feat files: Normally intermediate output files are not saved. However, if you are going to process the same file a number of times, you can save time by saving some of the output files and reusing them on each pass. For instance, you might save the output of the segmenter and then on subsequent runs you can disable the segmenter and just use the saved .seg file instead of wasting time reanalyzing the file each time around.

Visualizer

The Visualizer presents data from .feat and .edl files in a variety of graphical formats. Mouse over a chunk to inspect its data in the "chunk data" frame on the left side of the screen. Click on chunks to select/deselect. Click and drag to select regions. Shift + click to add to selection.

The available visualizers are:

segment order: segments and feature values from the .feat file are at the top of the screen, segments and features from the corresponding .edl file are at the bottom. Multi-dimensional features are presented spectrogram style. The .feat file chunks are drawn in their original time order. The .edl chunks are drawn in their composed time order. A line is drawn from each .feat chunk to the corresponding .edl chunk (which is the same chunk shifted in time). "multi lines" and "thick lines" are just candy.
scatter plot: features are mapped to drawing parameters (x axis, y axis, height, width, color) and each chunk is drawn in the appropriate place. You might, for example, map start time to the x axis and pitch to the y axis to display pitch contours over time.
bar graph: one bar graph is drawn for each feature. "show" allows you to display the chunks in their original .feat time order or in their composed .edl time order. "height" allows you to map different values (start time, dest time, length, features) onto the height parameter.
line graph: really stupid!

General controls:

visualization type: select which visualization technique to use
load files: select and load a new .feat and .edl file (you can also load just a .feat file if you choose).
save .feat/.edl: saves the selected chunks to a new .feat or .edl file
synthesizer: plays selected chunks in either .feat or .edl order.
zoom: zooms!
selection controls: mostly self-explanatory. "apply selection filter" allows you to select chunks numerically.


		Want Where Go To Want ®		meapsoft@music.columbia.edu