Spectral Mutation in Soundhack

Larry Polansky

Bregman Electro-Acoustic Music Studio

Dartmouth College

larry.polansky@dartmouth.edu

http://onyx.dartmouth.edu/~larry/polansky.html

Tom Erbe

School of Music

California Institute of the Arts

tre@music.calarts.edu

http://music.calarts.edu/~tre

Abstract

Soundhack is a widely used Macintosh program that performs a wide variety of soundfile manipulations. The program, authored by Tom Erbe, includes soundfile type conversion, spectral dynamics processing, a varispeed/sample rate converter, soundfile convolution, ring modulation, a phase vocoder, a binaural filter and an amplitude analysis and gain change module.

With versions 0.8 and above of Soundhack, the authors have implemented a system of spectral mutation functions, which allow for a wide variety of transformations between soundfiles in the spectral domain. This article, largely drawn from the Soundhack manual (Erbe, 1994), describes the program's spectral mutation features in detail.

Introduction

Tom Erbe's Soundhack is a widely used program for soundfile manipulation on the Macintosh. Most of Soundhack's functions are based on a flexible FFT/IFFT process, with user control over the size, shape and resolution of the FFT and windowing. The program offers high quality soundfile transformations such as convolution, phase vocoder (allowing time stretching and pitch shifting), binaural filters, spectral dynamics processing, and other functions.

Larry Polansky's mutation functions (Polansky and McKinney, 1991; Polansky, 1992) are generalized morphological transformations which create mutations "between" morphologies (ordered sets). The mutation functions, originally designed for higher-level morphologies (such as melodies, duration series or statistical profiles of a parametric interval over time), consider intervals between elements of a morphology and separate those intervals into magnitude and sign (direction).

Each of the mutation functions "pastes" or interpolates, in a different way, the sign or magnitude of one morphology into the sign or magnitude of the other, creating a third morphology which is some measurable combination of the two sources. In Soundhack Version 0.8 and above, the mutation functions are implemented in the spectral domain, where the individual FFT frames of soundfiles are considered to be the morphologies.

Introduction to the Mutation Functions

Each spectral mutation function produces a different timbral "cross-fade." Each function takes two soundfiles (the source and the target) and returns a third soundfile (the mutant). The functions operate on the phase/amplitude pair of each frequency band of the source and target spectra. The output of the functions is a phase/amplitude pair for each frequency band in the mutant soundfile.

Each output phase/amplitude pair in the mutation is some "combination" of the phase/amplitude pairs of the source and target, for the corresponding frequency band. The functions work on the sign (contour) or the magnitude of an interval, or both. They either change a selected number of bands completely from the source to the target (irregular) or partially change all bands (uniform).

The following diagram shows interval computation for three successive FFT-frames of three spectral bands of one soundfile. Maximum amplitude for a band is 1.0. This computation uses adjacency intervals.

The degree of mutation is called the mutation index, Ω. It is specified either as a single value between 0 (completely the "source") and 1 (completely the "target") or as a function over the duration of the resultant soundfile. Intermediary values for Ω create spectra which are "interpolations" (not necessarily linear) between the two input spectra. The mutation is thus a soundfile spectrally "between" the source and the target in some way (specified by the mutation type) and to some degree (specified by Ω).

Five different mutation functions are used in Soundhack, along with two concatenations of them. They are called: Uniform Signed Interval Mutation (USIM), Irregular Signed Interval Mutation (ISIM), Uniform Unsigned Interval Mutation (UUIM), Irregular Unsigned Interval Mutation (IUIM), and the Linear Contour Mutation (LCM). These functions are described in detail in Polansky (1992) and Polansky and McKinney (1991).

Each mutation function deals with the interval between the nth and (n+1)th frames (time slices) of the ith frequency band of a spectrum. That is, the amplitude in a given band for a given frame of the mutation is the result of some transformation on the source and target amplitude intervals between the current frame and the previous one. This is called the adjacency interval.

In Soundhack, the user may select an alternative approach, absolute interval, which sets an absolute amplitude to which all intervals are taken (somewhat anchoring the mutation trajectory). The following diagram shows absolute interval computation for three successive FFT-frames of three spectral bands of one soundfile, absolute intervals computed to the value .5.
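The two kinds of interval can be illustrated with a minimal Python sketch. The function names and the sample amplitudes are made up for illustration and are not taken from Soundhack itself; the sketch only assumes that an interval is the signed difference between two amplitudes.

```python
def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def adjacency_intervals(amps):
    """Signed intervals between successive frames of one frequency band."""
    return [cur - prev for prev, cur in zip(amps, amps[1:])]


def absolute_intervals(amps, reference=0.5):
    """Signed intervals from every frame of one band to a fixed amplitude."""
    return [a - reference for a in amps]


def split(interval):
    """Separate a signed interval into (sign, magnitude)."""
    return sign(interval), abs(interval)


# One frequency band over three successive frames (hypothetical values).
band = [0.25, 0.75, 0.5]
adjacency = adjacency_intervals(band)  # [0.5, -0.25]
absolute = absolute_intervals(band)    # [-0.25, 0.25, 0.0]
```

Note that three frames yield only two adjacency intervals, but three absolute intervals, since every frame is measured against the same reference amplitude.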

The Different Mutation Functions

The USIM and the ISIM are the simplest mutations, a spectral cross-fade and spectral band-replacement, respectively. The USIM ("uniform") operates on each spectral band interval for every FFT-frame, cross-fading amplitude differences between corresponding spectral bands. The ISIM replaces the amplitude interval of the source by that of the target for each frame to create that of the mutation (simply adding the signed interval to the amplitude of the previous spectral frame).

In irregular mutations, the decision of whether or not to mutate (replace the source amplitude by the target) is determined by the mutation index Ω, treated as a probabilistic value. In other words, the ISIM (like the IUIM and LCM) is a stochastic spectral band replacement based on the value of Ω.
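As a rough sketch of this probabilistic reading, each band can be given an independent random draw that is compared against the mutation index. The helper name choose_bands and the use of one independent draw per band are assumptions for illustration; the text does not say how Soundhack actually makes the selection.

```python
import random


def choose_bands(num_bands, omega, rng):
    """Return one boolean per band: True where the band is mutated.

    Each band is chosen independently with probability omega, so on
    average a fraction omega of the bands is replaced in a frame.
    """
    return [rng.random() < omega for _ in range(num_bands)]


rng = random.Random(0)             # seeded only to make the sketch repeatable
mask = choose_bands(1024, 0.25, rng)
fraction = sum(mask) / len(mask)   # close to 0.25, but not exactly 0.25
```

With omega at 0 no band is ever chosen, and with omega at 1 every band is chosen, which matches the two end points of the mutation.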

The UUIM and the IUIM are two different ways of "pasting" or interpolating the amplitude differences of the target spectra onto the sign of the source, resulting in a mutation which, when completely mutated, is still some combination of the source and target. The UUIM interpolates between the source and target interval magnitudes, but retains the sign, or direction, of the source interval. A spectrum completely mutated by the UUIM has the magnitude intervals of the target but the signs of the source intervals. The IUIM is the irregular version of the UUIM, pasting magnitude intervals stochastically while retaining sign.

The LCM, perhaps the most unusual-sounding and unpredictable mutation, does the opposite of the IUIM: it pastes the signs of the target onto the magnitudes of the source, generally resulting in an unrecognizable but often sonically interesting spectral transformation.

It is important to stress that even with Ω = 1 (the maximum value of the mutation index), the UUIM, IUIM, and LCM will not result in a mutant spectrum that is the same as the target. Rather, the spectrum will be some sign- or magnitude-equivalent "image" of the target spectrum.

The mutation functions can be classified as follows:

Function   Uniform   Mutates sign   Mutates magnitude
USIM       yes       yes            yes
ISIM       no        yes            yes
UUIM       yes       no             yes
IUIM       no        no             yes
LCM        no        yes            no

Note that the opposite of "irregular" is "uniform." Uniform mutations do something to every frequency band. How much they do depends on the mutation index Ω. Irregular mutations change a given frequency band completely from source to target (sign, magnitude or both), but the number of frequency bands they operate on depends on Ω. In irregular mutations, Ω may be thought of as the "probability of replacement" for a given band (the percentage of bands replaced). In uniform mutations, Ω functions as an interpolation value (degree of "between-ness"). Note that there is no uniform contour mutation (UCM), since it would not make sense: contour is defined in the context of these mutations as being up, down, or equal, which leaves nothing to interpolate.

Concatenated Mutations

Irregular mutations may be combined, or concatenated, with their complementary forms to produce more complete transformations. For example, the LCM, which only changes the sign of intervals, can be concatenated with either the UUIM or the IUIM, each of which transforms interval magnitude. In the former (LCM/UUIM), each interval is crossfaded; in the latter (LCM/IUIM), some number of intervals are changed completely each time.

Concatenated mutations work like "pipes": for each frame the output of the first mutation is sent to the input of the next. For the LCM/IUIM, each part of the concatenation mutates an independent set of stochastically chosen spectral bands. Although the end result of the concatenated mutations LCM/IUIM and LCM/UUIM will be the same as the USIM or ISIM, the intermediate mutation trajectory will be different. The concatenated mutations tend to be jagged, since at least one of the mutations is operating stochastically.
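The "pipe" idea can be sketched for a single frame as follows. This is only an illustration under stated assumptions: absolute intervals are used with one shared reference amplitude (ref) for source and target, each stage draws its own independent set of bands with probability equal to the mutation index (omega), and all function names are hypothetical rather than Soundhack's own.

```python
import random


def sign(x):
    return (x > 0) - (x < 0)


def lcm(source, target, ref, mask):
    """Linear Contour Mutation: target sign pasted onto source magnitude."""
    out = []
    for s, t, mutate in zip(source, target, mask):
        s_int, t_int = s - ref, t - ref
        sgn = sign(t_int) if mutate else sign(s_int)
        out.append(ref + sgn * abs(s_int))
    return out


def iuim(source, target, ref, mask):
    """Irregular unsigned mutation: target magnitude pasted onto source sign."""
    out = []
    for s, t, mutate in zip(source, target, mask):
        s_int, t_int = s - ref, t - ref
        mag = abs(t_int) if mutate else abs(s_int)
        out.append(ref + sign(s_int) * mag)
    return out


def lcm_iuim(source, target, ref, omega, rng):
    """Pipe one frame through the LCM, then feed the result to the IUIM."""
    n = len(source)
    first_mask = [rng.random() < omega for _ in range(n)]
    second_mask = [rng.random() < omega for _ in range(n)]  # independent set
    stage_one = lcm(source, target, ref, first_mask)
    return iuim(stage_one, target, ref, second_mask)


source = [0.0, 0.25, 0.75, 1.0]   # hypothetical band amplitudes, one frame
target = [0.75, 0.0, 1.0, 0.25]
unmutated = lcm_iuim(source, target, 0.5, 0.0, random.Random(1))
halfway = lcm_iuim(source, target, 0.5, 0.5, random.Random(1))
complete = lcm_iuim(source, target, 0.5, 1.0, random.Random(1))
```

In this example the index 0 returns the source frame and the index 1 returns the target frame, while intermediate values mix bands in a way that depends on the two random selections. A band whose source interval is exactly zero carries no sign to paste onto, so it stays at the reference amplitude in this sketch.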

An example of a concatenated mutation

The following examples are an attempt to visualize an LCM/IUIM mutation of a ramp function into a more complex one, a highly randomized triangular function. Individual values for these functions may be considered to be spectral band amplitudes for one FFT frame. That is, the first function (which might be thought of as one frame of a sawtooth wave) is mutated into the last (one frame of an extremely noisy triangle wave).

The first visualization is a conventional "waterfall" plot, where the z-axis is time (assuming the mutations are performed over time to effect the transformation of one FFT into another). The x-axis is FFT-band, and the y-axis is amplitude. This picture shows the result of ten intermediary mutations between the source and target. The LCM/IUIM uses an absolute index around the midpoint of the "sawtooth."

The second visualization shows only four of the intermediary mutations.

Theoretical Definitions of the Mutations

The mutation functions are mathematically defined below. S and T are the source and target soundfiles. Si, Ti are the amplitudes for a given frequency band of the FFT for the ith frame of the sound. Sj, Tj are either the amplitudes of the same band in the previous frame (Relative Interval, called the adjacency interval above) or some absolute amplitude (Absolute Interval). Mi is the new amplitude of the given frequency band of the current frame of the output sound. Mj is the amplitude for that band in the previous frame of the output sound (Relative Interval), or some absolute amplitude value between 0.0 and 1.0 (Absolute Interval). Tint, Sint and Mint are the signed magnitude intervals between the amplitude of the current frame for a given band and the amplitude of that band in the previous frame (Relative Interval) or some fixed amplitude (Absolute Interval).

Each equation applies to one frequency band of the source, target and mutant soundfile spectra. For all functions, Si, Ti, and Mi run from 0 to the number of frequency bands in the FFT.

The two basic interval functions (for sign and magnitude differences between the same frequency band of two FFT frames) are:

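Smag = |Si - Sj| (and likewise Tmag = |Ti - Tj|)

and

Ssgn = sgn(Si - Sj) (and likewise Tsgn = sgn(Ti - Tj))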

The five fundamental mutation functions are:

* Uniform Signed Interval Magnitude (USIM)

Mi = Mj + Sint + Ω * (Tint - Sint)

* Uniform Unsigned Interval Magnitude (UUIM)

Mi = Mj + Ssgn * (Smag + Ω * |Tmag - Smag|)

-- where Sint and Tint are (Ssgn * Smag) and (Tsgn * Tmag) respectively

* Linear Contour Mutation (LCM)

Mi = Mj + Tsgn * Smag (for mutated intervals)

Mi = Mj + Ssgn * Smag (general form for non-mutated intervals)

* Irregular Unsigned Interval Magnitude (IUIM)

Mi = Mj + Ssgn * Tmag (for mutated intervals; non-mutated intervals same as LCM above)

* Irregular Signed Interval Magnitude (ISIM)

Mi = Mj + Tsgn * Tmag (for mutated intervals; non-mutated intervals same as LCM above)

In absolute interval mutations, Mj, the absolute amplitude to which the new interval is added, is interpolated between absolute values for source and target, according to the value for the mutation index Ω. The irregular mutations are given in two forms: one for the case when that particular frequency band is chosen for mutation, one for when it is not.
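The five definitions can be collected into one small Python function that returns the interval to be added to Mj for a single band. This is a sketch rather than Soundhack's code: the function and argument names are hypothetical, omega stands for the mutation index, chosen stands for the stochastic decision in the irregular mutations, and the UUIM line transcribes the formula exactly as printed, including the absolute value around the magnitude difference.

```python
def sign(x):
    return (x > 0) - (x < 0)


def mutant_interval(kind, s_int, t_int, omega=0.0, chosen=False):
    """Interval added to Mj to obtain Mi, for one band of one frame.

    s_int and t_int are the signed source and target intervals.
    omega is used by the uniform mutations; chosen says whether an
    irregular mutation selected this band for mutation.
    """
    if kind not in ("USIM", "UUIM", "LCM", "IUIM", "ISIM"):
        raise ValueError("unknown mutation type: " + kind)
    s_sgn, s_mag = sign(s_int), abs(s_int)
    t_sgn, t_mag = sign(t_int), abs(t_int)
    if kind == "USIM":
        return s_int + omega * (t_int - s_int)
    if kind == "UUIM":
        return s_sgn * (s_mag + omega * abs(t_mag - s_mag))
    if not chosen:                # non-mutated form, shared by LCM, IUIM, ISIM
        return s_sgn * s_mag
    if kind == "LCM":
        return t_sgn * s_mag      # target sign, source magnitude
    if kind == "IUIM":
        return s_sgn * t_mag      # source sign, target magnitude
    return t_sgn * t_mag          # ISIM: target sign and magnitude


# Hypothetical values: previous mutant amplitude 0.5, a rising source
# interval of 0.25 and a falling target interval of -0.5.
m_prev, s_int, t_int = 0.5, 0.25, -0.5
usim_half = m_prev + mutant_interval("USIM", s_int, t_int, omega=0.5)
lcm_hit = m_prev + mutant_interval("LCM", s_int, t_int, chosen=True)
iuim_hit = m_prev + mutant_interval("IUIM", s_int, t_int, chosen=True)
isim_hit = m_prev + mutant_interval("ISIM", s_int, t_int, chosen=True)
isim_miss = m_prev + mutant_interval("ISIM", s_int, t_int, chosen=False)
```

In this example only the ISIM on a chosen band reproduces the full target interval; the LCM and IUIM each carry over one half of it (sign or magnitude) and keep the other half from the source.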

Visual Examples of the Mutations

The example below shows the USIM from a sawtooth to a slightly and randomly rippled triangle.

The next seven examples show each of the mutations on the same two simple functions, plotted over up to ten gradually increasing values for the mutation index Ω from 0 to 1. In the case of the incomplete mutations (LCM, UUIM, IUIM) the final triangular function is not shown (nor arrived at). For the sake of visual clarity in some of the more complex mutations, fewer intermediary functions are shown. All mutations use absolute intervals of 0.5, with maximum amplitude 1.0.

The Mutation Functions Implemented in Soundhack

The following is a description of the main Soundhack spectral mutation function screen and each of its functions. The main screen is shown below.

Type

The Type box allows for selection between the seven quite different mutation functions: USIM (Uniform Signed Interval Mutation), ISIM (Irregular Signed), IUIM (Irregular Unsigned), UUIM (Uniform Unsigned), LCM (Linear Contour Mutation), and the concatenations LCM/IUIM and LCM/UUIM.

The simplest mutations are the USIM (a simple spectral crossfade) and the ISIM (a spectral replacement). These two mutations, unlike the UUIM and LCM, actually arrive at the source or target, depending on the direction of the mutation. That is, the source or target will actually be heard with Ω = 0.0 or Ω = 1.0, respectively, where Ω is the mutation index. The IUIM, UUIM and the LCM will, with Ω = 1, produce a "sign-equivalent" or "magnitude-equivalent" image of the target. These "incomplete mutations" mutate either the sign or the magnitude of intervals between successive spectral bands, but not both. That is, the mutation will have the same signs or magnitudes as the target, but not both.

As explained above, the concatenations (LCM/IUIM, LCM/UUIM) apply the second mutation to the output of the first. The concatenations are complementary: for each frame the LCM mutates sign while the IUIM and UUIM mutate magnitude. Different frequency bands are used for each "stage" of the concatenation, so the mutation trajectory may be unpredictable.

Mutation Index (Ω)

The mutation index, Ω, determines the amount of spectral mix, from 0 to 1, between the source and target that results in the mutant. Ω = 0 results in all source file, Ω = 1 all target. Ω may vary over the course of the mutation. A constant index will result in a sound which is a spectral mix of the source and target. More dynamic sounds are produced with an index function, which changes over time.
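A simple index function can be sketched as a straight-line ramp with one index value per FFT frame. The helper omega_ramp and the example amplitudes are hypothetical, and the crossfade uses the USIM with absolute intervals taken from a single shared reference amplitude; Soundhack's own index functions may be drawn or computed differently.

```python
def omega_ramp(num_frames, start=0.0, end=1.0):
    """Linear index function: one mutation index value per frame."""
    if num_frames < 1:
        raise ValueError("num_frames must be at least 1")
    if num_frames == 1:
        return [start]
    step = (end - start) / (num_frames - 1)
    return [start + k * step for k in range(num_frames)]


def usim_band(source, target, omegas, ref=0.5):
    """USIM on one band, with absolute intervals measured from ref."""
    out = []
    for s, t, omega in zip(source, target, omegas):
        s_int, t_int = s - ref, t - ref
        out.append(ref + s_int + omega * (t_int - s_int))
    return out


omegas = omega_ramp(5)             # [0.0, 0.25, 0.5, 0.75, 1.0]
source_band = [0.0] * 5            # a silent band in the source
target_band = [1.0] * 5            # a full-amplitude band in the target
mutant_band = usim_band(source_band, target_band, omegas)
```

Here the mutant band simply follows the ramp, starting at the source amplitude and ending at the target amplitude. A constant index would instead hold the band at one fixed mix for the whole file.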

Absolute Interval

There are two methods used by the mutation functions to compute intervals between frequency bands: Absolute (the default) and Relative. Checking or unchecking the Absolute Interval box selects between them.

If Absolute intervals are used, absolute amplitude values may be specified between 0.0 and 1.0 for the source and target (Source Abs. Value, Target Abs. Value), from which all intervals will be taken. The choice of values can produce interesting effects, often "centering" the frequencies in which the mutation happens, making the mutations themselves less extreme. If two different values are used, amplitudes will be "transposed" from the source to the target. The use of Absolute intervals rather than Relative will be most noticeable for the LCM, IUIM and UUIM, as well as the concatenated mutations. Low values (between .1 and .2) are a good place to start (note that .1 means 1/10th of the total amplitude of the soundfile's spectra).

If Absolute is unchecked (Relative intervals), each mutation function uses amplitude intervals between successive frames of the spectra, multiplied by the mutation index Ω, to create the corresponding frame of the mutant sound. Relative interval mutations will tend to "drift," often in extreme ways, and the ISIM and concatenated mutations may never "arrive." However, some very interesting sonic results may be produced in this way.

Delta Emphasis

If the mutation uses Relative Intervals (Absolute Intervals unchecked), a value may be set for Delta Emphasis (DE). DE allows control over the degree to which successive mutation intervals are emphasized in the resulting mutant.

DE values range from -1.0 to 1.0, with the default at 0.0 (no emphasis or de-emphasis). For positive DE values, the current frame's intervallic characteristics will be emphasized more than those of the previous mutant frames. For negative values, the current frame will be "damped," emphasizing the previous information.

One way to think about this is as a way of "slowing down" the mutation: a negative DE value will keep the more chaotic mutations from getting "out of control." A negative DE value will function as a kind of low-pass filter, smoothing out large differences between subsequent spectral frames and averaging the previous spectral frames into the current output. Positive DE values will accentuate the often high-frequency activity of the mutations. Delta Emphasis can be useful in tailoring the relative interval mutations, especially "incomplete" ones.

Band Persist (Irregular Mutations Only)

Band Persist pertains only to irregular mutations: LCM, ISIM, IUIM and concatenations. It only appears on the screen when irregular mutations are selected. High values for Band Persist (towards 1.0) will produce more "stable" mutations. Low values (towards 0.0) will introduce a kind of frequency pumping at the frame rate. By changing the number of frequency bands, unusual sonic results are achieved.

As explained above, irregular mutations do not mutate every frequency band for each FFT frame. The mutation index Ω determines the percentage of bands that are mutated for a given frame. If a band is mutated, it completely assumes the particular characteristic (interval sign or magnitude) of the target interval, and retains either the sign or the magnitude of the source interval. For example, the LCM takes the sign of the target interval, and "pastes" it onto the magnitude of the source. However, it only does that for (Ω * number of bands) bands. The selection of which bands to mutate in irregular mutations is done stochastically, but setting Band Persist high will ensure that once a band is mutated, it will keep being mutated as long as possible. That is why a high value for Band Persist will stabilize these highly unusual mutations, making them a bit more "well-behaved." A good experiment is to try an irregular mutation (LCM, IUIM, ISIM, the concatenations) with a fixed Ω, and two different values of Band Persist, one high and one low.
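One way to picture the effect of Band Persist is the following Python sketch. The rule used here is purely an assumption made for illustration, not a description of Soundhack's internal algorithm: for each band, with probability band_persist the previous frame's decision is kept, and otherwise the band is redrawn with probability equal to the mutation index (omega).

```python
import random


def next_mask(prev_mask, omega, band_persist, rng):
    """Choose the mutated bands for the next frame (hypothetical rule)."""
    new_mask = []
    for was_mutated in prev_mask:
        if rng.random() < band_persist:
            new_mask.append(was_mutated)           # keep last frame's choice
        else:
            new_mask.append(rng.random() < omega)  # fresh stochastic choice
    return new_mask


def count_flips(num_bands, num_frames, omega, band_persist, seed=0):
    """Count how often a band switches between mutated and not mutated."""
    rng = random.Random(seed)
    mask = [rng.random() < omega for _ in range(num_bands)]
    flips = 0
    for _ in range(num_frames - 1):
        new_mask = next_mask(mask, omega, band_persist, rng)
        flips += sum(a != b for a, b in zip(mask, new_mask))
        mask = new_mask
    return flips


frozen = count_flips(64, 50, 0.5, band_persist=1.0)   # no band ever switches
sticky = count_flips(64, 50, 0.5, band_persist=0.9)
restless = count_flips(64, 50, 0.5, band_persist=0.0)
```

Under this rule a high setting makes bands switch rarely, while a low setting lets the set of mutated bands change on almost every frame, which is one plausible source of the frame-rate "pumping" described above.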

Time Scale Target

The form of the mutation functions used in Soundhack requires that each soundfile (source, target, mutant) be of equal length. The default technique is to truncate the longer of the two files, producing a mutant which is the length of the shorter file. If Time Scale Target is checked, the target soundfile will be time-stretched or -compressed to be of the same length as the source. Other techniques are of course possible (including windowing and zero-padding), and we encourage other software developers to investigate these.
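The two length-matching options can be sketched on lists of frames. The truncation helper follows the default behaviour described above. The rescaling helper is only a crude stand-in that repeats or skips whole frames by nearest index; it is not the time-stretching method Soundhack uses, and both function names are hypothetical.

```python
def truncate_to_shorter(source_frames, target_frames):
    """Cut both frame lists to the length of the shorter one."""
    n = min(len(source_frames), len(target_frames))
    return source_frames[:n], target_frames[:n]


def rescale_frames(frames, new_length):
    """Repeat or skip frames so that the list has new_length entries."""
    if not frames or new_length <= 0:
        return []
    old_length = len(frames)
    return [frames[k * old_length // new_length] for k in range(new_length)]


source_frames = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6]]  # six frames
target_frames = [[0.9], [0.8], [0.7]]                       # three frames

# Default: the mutant is as long as the shorter file.
short_source, short_target = truncate_to_shorter(source_frames, target_frames)

# Time Scale Target: the target is fitted to the length of the source.
scaled_target = rescale_frames(target_frames, len(source_frames))
```

Truncation discards the end of the longer file, whereas rescaling keeps the whole target but changes its pace, so the two options can give quite different mutants from the same pair of files.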

Acknowledgements

Various students at Dartmouth and Mills Colleges, including Chris Langmead, Eric Smith, Steve Berkley, Greg Higgs, Ted Apel and Martin McKinney have provided invaluable assistance over the past few years in helping to formulate both the spectral mutation ideas and accompanying software techniques. Ken Overton, Sergei Kossenko, and Chris Langmead all contributed valuable suggestions to this article. Susan Schwarz of Academic Computing at Dartmouth College helped create some of the figures, using Data Explorer on an RS-6000 computer. Larry Polansky was able to software-prototype some of the mutation functions during a 1994 National Endowment for the Arts sponsored residency at the Mills College Center for Contemporary Music. Dave Madole, Technical Director of the Mills College Center for Contemporary Music was especially helpful in these early stages of software development. Polansky expresses his deep appreciation to Chris Brown and John Bischoff of the Mills CCM for arranging and administering the residency.

References

Erbe, Tom. 1994. Soundhack Manual. Lebanon, NH: Frog Peak Music. (Section on spectral mutation written by L. Polansky)

Polansky, Larry, and McKinney, Martin. 1991. "Morphological Mutation Functions: Applications to Motivic Transformation and a New Class of Cross-Synthesis Techniques." Proceedings of the International Computer Music Conference. Montreal. pp. 234-241.

Polansky, Larry. 1992. "More on Morphological Mutation Functions: Recent Techniques and Developments." Proceedings of the International Computer Music Conference. San Jose. pp. 57-60.