
The Frequency Domain

Time domain representations show us a lot about the amplitude of a signal at different points in time. Amplitude is a word that means, more or less, "how much of something," and in this case it might represent pressure, voltage, some number which measures those things, or even the in-out deformation of the eardrum.

For example, the time domain picture of the waveform below starts with the attack of the note, continues on to the steady state portion (sustain) of the note and ends with the cutoff and decay (release). We sometimes call the attack and decay transients, because they only happen once, and don't stay around! We also use the word transient, perhaps more typically, to describe timbral fluctuations during the sound that are irregular or singular. We use the word transient to distinguish between those kinds of sounds and the steady state.

From the picture below of a fairly typical sound event, we can tell something about how the sound’s amplitude develops over time (what we call its amplitude envelope). But from this picture we can't really tell much of anything about what is usually referred to as the timbre or "sound" of the sound: What instrument is it? What note is it? Is it bright or dark? Is it someone singing or a dog barking, or maybe a Backstreet Boys bootleg? Who knows! These time domain pictures all look pretty much alike.

Figure .x A time-domain waveform. It's easy to see the attack, steady state, and decay portions of the "note" or sound event, because these are all pretty much variations of amplitude, which time domain representations show us quite well. The amplitude envelope is a kind of average of this picture.

We can even be a little more precise, and mathematical (why not?). If the amplitude of the nth sample of the waveform above is A[n], we can make a new signal S[n], our envelope, whose nth sample is the average of A[n] and its two neighbors:

S[n] = (A[n-1] + A[n] + A[n+1])/3

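This averaging operation is sometimes called smoothing or lowpass filtering. Averaged over a long enough stretch of samples (and using absolute values, so that positive and negative swings don't cancel each other out), the result would look like the envelope. We'll talk more about it later.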
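As a rough illustration of the three-sample average, here is a minimal Python sketch. The function name smooth3 and the choice to copy the first and last samples through unchanged are assumptions made for this example, not something taken from the text.

```python
def smooth3(a):
    """Three-point average: s[n] = (a[n-1] + a[n] + a[n+1]) / 3.

    Assumption of this sketch: the first and last samples have only one
    neighbor, so they are copied through unchanged.
    """
    if len(a) < 3:
        return list(a)
    s = [a[0]]
    for n in range(1, len(a) - 1):
        s.append((a[n - 1] + a[n] + a[n + 1]) / 3)
    s.append(a[-1])
    return s


# A jagged signal that jumps between 0 and 3 on every sample.
signal = [0.0, 3.0, 0.0, 3.0, 0.0, 3.0, 0.0]
print(smooth3(signal))
```

The smoothed values swing over a smaller range than the original ones, which is the sense in which the average smooths the signal.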


Soundfile .x Monochord Sound
Soundfile .x Trumpet Sound

Two Sounds, and Two Different Kinds of Amplitude Envelopes

The two figures and sounds above (one of a trumpet, one of a one-stringed instrument called a monochord, made for us by Sam Torrisi) illustrate different ways of looking at amplitude and the time domain.

In each of the above, the time domain signal itself is given by the light blue area. This is exactly the same as what we showed you at the beginning of this section. But we've added two more envelopes to these figures, to illustrate two useful ways to think of a sound event.

The magenta line more or less follows the peaks of the signal, or its highest amplitudes. Note that it doesn't matter whether these amplitudes are very positive or very negative; all we really care about is their absolute value, which is more or less like saying how much energy or displacement there is (in either direction). Sometimes we even simplify this further and measure the peak-to-peak amplitude of a signal, just looking at the maximum range of amplitudes (this will tell us, for example, if our speakers/ears will be able to withstand the maximum of the signal). In the figures above, we look at some number of samples at a time, and more or less stay on the highest value in that window (that's why the line has a staircase look).

The dark blue line is a running average of the absolute value of the signal, which in effect smooths it out tremendously and also attenuates it. There's a similar measure, called RMS (root-mean-square) amplitude, which tries to give an overall average of energy. Once again, we used a running window technique, averaging the last n samples (where n is the length of the window). Different values for n would give very different pictures.
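To make the measurements described above concrete, here is a minimal Python sketch of a peak-holding envelope, a running average of absolute values, a running RMS, and a peak-to-peak measurement. This is not the code that made the figures; the function names, the block-by-block peak hold, and the handling of the first few samples (where fewer than n samples are available) are all assumptions made for illustration.

```python
import math


def peak_envelope(a, window):
    """Hold the largest absolute value found in each block of `window` samples.

    Holding one value per block is what gives the staircase look.
    """
    env = []
    for start in range(0, len(a), window):
        block = a[start:start + window]
        peak = max(abs(x) for x in block)
        env.extend([peak] * len(block))
    return env


def running_mean_abs(a, window):
    """Average the absolute values of the last `window` samples."""
    env = []
    for n in range(len(a)):
        recent = a[max(0, n - window + 1):n + 1]  # shorter near the start
        env.append(sum(abs(x) for x in recent) / len(recent))
    return env


def running_rms(a, window):
    """Root-mean-square of the last `window` samples."""
    env = []
    for n in range(len(a)):
        recent = a[max(0, n - window + 1):n + 1]
        env.append(math.sqrt(sum(x * x for x in recent) / len(recent)))
    return env


def peak_to_peak(a):
    """Distance between the highest and lowest sample values."""
    return max(a) - min(a)


# A made-up decaying 5 Hz sine wave, 200 samples long.
signal = [math.exp(-t / 50) * math.sin(2 * math.pi * 5 * t / 100)
          for t in range(200)]
print(peak_envelope(signal, 20)[::20])
print(peak_to_peak(signal))
```

Trying a few different values for the window length shows how much the choice of n changes the picture: short windows follow every wiggle, long ones flatten the envelope out.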

Just to give you some idea of how we make these kinds of graphs and measurements, we've included (in the Extra Bit) the computer code that made these pictures, written in a popular mathematical modelling program called MATLAB. By studying this code and the accompanying comments, you can get some idea of what computer music software often looks like, and how you might go about making similar kinds of measurements.

The Frequency/Amplitude/Time Plot

Distinguishing between sounds is one place where the frequency domain comes in. The picture below is a frequency/amplitude/time plot of the same sound as the time domain picture above. This new kind of sound-image is called a sonogram. Time still goes from left to right along the X-axis, but now the Y-axis is frequency, not amplitude. Amplitude is encoded in the intensity of a point on the image — the darker the point, the more energy present at that frequency at around that time (remember that all these frequency components make up one sound!).

For example, the semi-dark line around 7400 Hz shows that from about 0.05 seconds to 0.125 seconds there is some contribution to the sound at that frequency. This is occurring in the attack portion. It pretty much dies out after a short period of time.

Figure .x This picture shows the same sound as that of the previous time domain figure, but now in the frequency domain as a sonogram. Here, the y-axis is frequency (or more accurately, frequency components). The darkness of the line indicates how much energy is present in that frequency component. The x-axis is, as usual, time.

What sorts of information do the last two pictures give us about the sound? Can you make some guesses about what sort of sound this might be?


What does this sonogram tell us about the sound? Remember that we said before that we use the entire frequency range to determine timbre as well as pitch. As it turns out, any sound contains many smaller component sounds at a wide variety of frequencies (we’ll be more rigorous about this below, because it's really important!). What you’re seeing in this sonogram is a representation of how all those component sounds change in frequency and amplitude over time.
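One common way to compute a picture like this is to cut the sound into short overlapping frames and measure how much of each frequency is present in each frame. The Python sketch below does that with a deliberately slow, plain Fourier transform so that it needs nothing beyond the standard library. The function names, the frame size, the hop size, and the Hann window are all assumptions of this sketch, and nothing here is claimed to be how the sonogram in the figure was made.

```python
import cmath
import math


def dft_magnitudes(frame):
    """Strength of each frequency component from 0 Hz up to half the sample rate."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        total = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        mags.append(abs(total))
    return mags


def sonogram(samples, frame_size, hop):
    """Return one row of magnitudes per time frame.

    Each frame is faded in and out with a Hann window before analysis,
    which reduces smearing between neighboring frequencies.
    """
    hann = [0.5 - 0.5 * math.cos(2 * math.pi * t / frame_size)
            for t in range(frame_size)]
    rows = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + t] * hann[t] for t in range(frame_size)]
        rows.append(dft_magnitudes(frame))
    return rows


# A made-up test signal: a steady 1000 Hz sine sampled at 8000 Hz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(256)]
rows = sonogram(tone, frame_size=64, hop=32)

# Row index is time, column index is frequency; find the darkest point
# in the first time frame and convert its column to Hz.
loudest_bin = max(range(len(rows[0])), key=lambda k: rows[0][k])
print(loudest_bin * sample_rate / 64)
```

To draw the sonogram, each row would become one vertical strip of the image, with larger magnitudes drawn darker. Real software would normally use a fast Fourier transform (FFT) rather than this direct sum.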

Now listen to this:
Soundfile .x Mystery sound.
The sonogram shows that the mystery sound starts with a burst of energy that is spread out across the frequency spectrum: notice the spikes that reach all the way up to the top frequencies in the image. Next it settles down into a fairly constant, more concentrated and lower energy state, where it remains until the end, when it quickly fades out. This is a pretty common description of a vibrating system: start it vibrating out of its rest state (chaotic, loud), listen to it settle into some sort of regular vibratory behavior, and then, if the energy source is removed (e.g., you stop blowing the horn or take your e-bow off your electric guitar string), listen to it decay (again, chaotic).

The presence of a band of high amplitude, low frequency energy, coupled with some lower amplitude, high frequency energy implies that we’re looking at some sort of pitched sound with a number of strong harmonics. The darkest low band is probably the fundamental note of the sound. By studying the sonogram, can you get a mental idea of what sort of sound it might be?

Listen to the sound a few times while watching the waveform and sonogram images. Can you follow along? Is there a clear correlation between what you see and what you hear? Does the sound look the way it sounds? Do you agree that the sonogram gives you a more informative visual representation of the sound? Isn’t the frequency domain cool?

Computer Code for Making Amplitude Envelope Pictures


Figure .x Song of the hooded warbler.

This is another kind of sonogram, kind of like a negative image of the sound moving in pitch (the y-axis) over time. The thickness of the line shows a lot about the pitch range. What this old style sonogram did was try to find the maximum energy concentration and give a picture of the moving pitch of a sound, natural or otherwise.

Sometimes pictures like this, which were very common a long time ago, are called melograms, or melographs, because they graph pitch in time. We got this wonderful picture out of an old book about recording natural sounds!
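The idea of following the maximum energy concentration over time can be imitated digitally. The Python sketch below is only a modern stand-in and says nothing about how the original melograph machines worked: it cuts a signal into frames and reports the single strongest frequency in each one. The function names and the made-up two-note test signal are assumptions of this example.

```python
import cmath
import math


def strongest_frequency(frame, sample_rate):
    """Frequency in Hz of the component with the most energy in one frame."""
    n = len(frame)
    best_bin, best_mag = 0, 0.0
    for k in range(1, n // 2):  # skip 0 Hz, which is just a constant offset
        mag = abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                      for t in range(n)))
        if mag > best_mag:
            best_bin, best_mag = k, mag
    return best_bin * sample_rate / n


def pitch_track(samples, sample_rate, frame_size):
    """Strongest frequency in each successive, non-overlapping frame."""
    return [strongest_frequency(samples[s:s + frame_size], sample_rate)
            for s in range(0, len(samples) - frame_size + 1, frame_size)]


# A made-up "melody": a 500 Hz tone followed by a 1000 Hz tone.
sample_rate = 8000
low = [math.sin(2 * math.pi * 500 * t / sample_rate) for t in range(128)]
high = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(128)]
print(pitch_track(low + high, sample_rate, frame_size=64))
```

Plotting those values against time gives a single moving line, much like the birdsong picture. Picking the strongest component is a crude pitch estimate, and it can be fooled when a harmonic is louder than the fundamental.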

Note: No hooded warblers were harmed or annoyed in the making of this picture.

Soundfile .x The song of the hooded warbler. Can you follow it with the picture above?


Figure .x Just for historical interest, the picture above is an example of an old process called phonophotography, an early (1920s) method for capturing a graphic image of a sound. It's essentially a melographic technique.

What we are looking at is a picture of a "recording" of a performance of the gospel song, "Swing Low, Sweet Chariot." This color image came from the work of a brilliant researcher named Metfessel.

This kind of highly descriptive analysis greatly influenced music theorists in the first part of the 20th century. Many people saw it as a kind of revolutionary mechanism for describing sound and music, potentially removing music analysis from the realm of the aesthetic, the emotional, and the transcendental into a more modernist, scientific, and objective domain.

