—> To Chapter 2

Timbre

What’s the difference between a tuba and a flute (or, more accurately, the sounds of each)? How do we tell the difference between two people singing the same song, even if they’re singing exactly the same notes? Why do some guitars "sound" better than others do (besides the fact that they're older and cost more and have Eric Clapton's autograph on them)? What is it that makes things "sound" like themselves?

It’s not necessarily the pitch of the sound (how high or low it is): if everyone in your family sang the same note, you could almost surely tell who was who, even with your eyes closed (it's not by smell, or even just the fact that grandma sings out of tune). It’s also not just the loudness: your voice is still your voice whether you talk softly or scream at the top of your lungs. So what’s left? The answer is found in a somewhat mysterious and elusive thing we call, for lack of a better word, timbre, and that's what this chapter is all about.

Actually, we have to admit that timbre (pronounced "tam-ber", and not "tim-ber" as in "look out below"!) is a kind of sloppy word, inherited from previous eras, that lumps together lots and lots of things that we don't fully understand. One of us thinks we should abandon that word and concept entirely, and compares it to the nineteenth-century concept of the ether, through which light waves were supposed to travel. But it's one of those words that gets used a lot, even if it doesn't make much sense, so we'll use it here too; we're sort of stuck with it for the time being. Maybe the best definition for it is everything that's not pitch and amplitude (but even that doesn't work, because pitch and amplitude are part of timbre too... okay, we'll stop now).

What Makes Up Timbre?

This applet lets you draw a periodic waveform. How much does the shape of the waveform (which is a result of its steady-state spectrum) influence what you hear? Some shapes seem to sound brighter than others, some duller.

We'll see, as we learn more about timbre, that the shape of the periodic waveform may not be all that important when it comes to us recognizing, distinguishing and identifying sounds. This is somewhat surprising, and contrary to 100 years of popular assumption. But you may notice, after all, that in redrawing the waveshape, you might not feel that you're creating a whole new sound event, just a kind of different "buzz."


As we mentioned, timbre can be roughly defined as all those qualities of a sound that aren’t just frequency or amplitude. These qualities might include:

  • Spectra: the aggregate of simpler waveforms (usually sine waves) that make up what we recognize as a particular sound. This is what Fourier analysis gives us (we'll discuss that in Chapter 3).
  • Envelope: the attack, sustain, and decay portions of a sound (often referred to as transients).

Envelope and spectra are very large concepts, and include a lot of sub-categories. For example, spectral features (the different ways that the spectral aggregates are organized statistically, in terms of shape and form) are very important. The relative "noisiness" of a sound, for instance, is a result, in large part, of its spectral relationships. Many facets of envelope (onset time, harmonic decay, spectral evolution, steady-state modulations) are not simply explained by just looking at the envelope of a sound. Researchers spend a great deal of time on very specific aspects of these ideas, and it's an exciting and interesting area for computer musicians to research.

Here's an image of the attack, sustain, and decay of a trumpet tone. Below is a simplified picture of the envelope.
Figure x. Simplified amplitude envelope, with the attack, sustain, and decay portions marked in that order along the time axis.

This image illustrates the attack, sustain, and decay portions of a standard amplitude envelope. This is a very simple, sort of idealized picture (called a trapezoidal envelope). We are not aware of any actual, natural occurrence of these kinds of straight-lined sounds!
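To make the trapezoidal envelope concrete, here is a minimal sketch that builds one as a list of gain values and uses it to shape a plain sine tone. The function names, the 8000 samples-per-second rate, and the attack and decay lengths are all our own choices for illustration; nothing here comes from the figure itself.

```python
import math

SAMPLE_RATE = 8000  # samples per second (an assumed value, chosen to keep lists short)

def trapezoid_envelope(n_samples, attack, decay):
    """Return n_samples gain values: straight rise, flat sustain, straight fall.

    attack and decay are lengths in samples (hypothetical parameter names).
    """
    if attack + decay > n_samples:
        raise ValueError("attack + decay must fit inside the envelope")
    sustain = n_samples - attack - decay
    rise = [i / attack for i in range(attack)]       # 0.0 climbing toward 1.0
    flat = [1.0] * sustain                           # steady-state portion
    fall = [1.0 - i / decay for i in range(decay)]   # 1.0 dropping toward 0.0
    return rise + flat + fall

def shaped_tone(freq_hz, seconds, attack_s, decay_s):
    """Multiply a sine wave, sample by sample, by a trapezoidal envelope."""
    n = int(SAMPLE_RATE * seconds)
    env = trapezoid_envelope(n, int(SAMPLE_RATE * attack_s), int(SAMPLE_RATE * decay_s))
    return [env[i] * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
            for i in range(n)]

# Half a second at 256 Hz with a quick attack and a slower decay.
tone = shaped_tone(256, 0.5, 0.05, 0.2)
```

The envelope never changes the frequency of the tone; it only scales how loud each moment is, which is why envelope and spectrum can be talked about as separate ingredients.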

It's helpful here to bring another descriptive term into our vocabulary: spectrum. A waveform's spectrum is its distribution of energy across frequencies. The combination of spectra (plural of spectrum) and envelope helps us to define the "color" of a sound. Timbre is difficult to talk about, because it’s hard to measure something subjective like the "quality" of a sound. This concept gives music theorists, computer musicians, and psychoacousticians a lot of trouble. However, computers have helped us make great progress in the exploration and understanding of the various components of what's been called, traditionally, timbre.

The basic elements of sound — sinewaves

As we've shown, your average piece of music can be a pretty complicated function. Nevertheless, it's possible to think of it as a combination of much simpler sounds (and hence simpler functions), even simpler than individual instruments. These basic atoms of sound, the sinusoids (sinewaves) we talked about in the previous sections, are sometimes called pure tones, like those produced when a tuning fork vibrates. We use the tuning fork to talk about these tones because it is one of the simplest, and easiest to understand, physical vibrating systems.

Although you might think that a discussion of tuning forks belongs more in a discussion of frequency (since that's pretty much all they do), we're going to use them to introduce the notion of sinusoids, the Fourier components of a sound.

Figure x

The tuning fork seen above rings at 256 Hz. You can hear the sound it makes by pressing on the waveform button to the right.

Compare the tuning fork sound to that of a pure sinewave at the same frequency of 256 Hz.

Tuning fork at 256 Hz.
Sine Wave at 256 Hz.

Tuning Forks

When you hit the tines of the tuning fork, it vibrates and emits a very pure note or tone. Tuning forks are able to vibrate at very precise frequencies. The frequency of a tuning fork is the number of times the tip goes back and forth in a second. And this number won't change, no matter how hard you hit that fork. As we mentioned, the human ear is capable of hearing sounds that vibrate all the way from 20 times a second, to those that vibrate 20,000 times a second. Low frequency sounds are like bass notes, and high frequency sounds are like treble notes. Low frequency means that the tines vibrate slowly, high frequency means that they vibrate quickly.

Figure x

Click on the different tuning forks above and see the different audiograms. Notice what they have in common! They are all roughly the same shape — simple waves that differ only in the width of the regularly repeating peaks. The higher tones give more peaks over the same interval. In other words, the peaks occur more frequently (get it? higher frequency!).
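If you'd like to check the "more peaks over the same interval" idea without the pictures, here is a small sketch that counts the peaks in one second of a sampled sinewave. The function name, the 48000 samples-per-second rate, and the choice of 256 Hz and 512 Hz as the two forks are assumptions made just for this example.

```python
import math

def count_peaks(freq_hz, seconds=1.0, sample_rate=48000):
    """Count the local maxima of a sampled sinewave over the given duration."""
    n = int(seconds * sample_rate)
    wave = [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]
    # A sample is a peak if it is higher than its left neighbor
    # and at least as high as its right neighbor.
    return sum(1 for i in range(1, n - 1)
               if wave[i - 1] < wave[i] >= wave[i + 1])

for fork_hz in (256, 512):
    print(fork_hz, "Hz fork:", count_peaks(fork_hz), "peaks in one second")
```

Doubling the frequency doubles the number of peaks that fit into the same second, while the shape of each individual peak stays the same.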

Figure x

When you whack the tines of the tuning fork (those are the tips of the "U"-shaped thing that makes up the tuning fork), it vibrates. The number of times the tines go back and forth in one second determines the frequency of a particular tuning fork.

Hit the tuning fork graphic above. You'll hear composer Warren Burt's piece for tuning forks, Improvisation in Two Ancient Greek Modes.

Now, why do the tuning fork functions have their simple, sinusoidal shape? Think about how the tip of the tuning fork is moving over time. We see that it is moving back and forth, from its greatest displacement in one direction, all the way back to just about the same displacement in the opposite direction. Imagine that you are sitting on the end of the tine (hold on tight!). When you move to the left that will be a negative displacement, and when you move to the right that will be a positive displacement. Once again, as time goes on we can graph the function that at each moment in time outputs your position. Your back and forth motion yields the functions many of you remember from trigonometry class: sines and cosines.
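Here is one way to write down that ride on the end of the tine, as a rough sketch. We assume an idealized fork whose tip position is exactly a sine function of time; the function name, the arbitrary displacement units, and the two "hit strengths" are invented for the example.

```python
import math

def tine_displacement(t, freq_hz=256.0, amplitude=1.0):
    """Idealized position of the tine tip at time t (in seconds).

    Positive values mean "to the right", negative values "to the left".
    """
    return amplitude * math.sin(2 * math.pi * freq_hz * t)

period = 1 / 256.0  # time for one full back-and-forth trip at 256 Hz

for hit in (0.5, 2.0):  # a gentle tap versus a hard whack (arbitrary units)
    far_right = tine_displacement(period / 4, amplitude=hit)
    far_left = tine_displacement(3 * period / 4, amplitude=hit)
    print("hit strength", hit, "-> farthest right", round(far_right, 3),
          "farthest left", round(far_left, 3))
```

Hitting harder changes how far you travel, but the extremes still arrive at a quarter and three quarters of the way through the same period, so the frequency stays put.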

Sine and cosine wave.

Thanks to Wayne Mathews for this image.

It turns out that any sound can be represented as a combination of different amounts of these sines and cosines of varying frequencies. If only we had enough tuning forks, we wouldn't need computers. The mathematical subject that explains sounds and other wave phenomena is called Fourier analysis, named after its discoverer, the great French mathematician Joseph Fourier (1768-1830).



Figure x

Spectra of (a) sawtooth wave; (b) square wave.

These pictures show the relative amplitudes of sinusoidal components of simple waveforms.

For example, Figure (a) indicates that a sawtooth wave can be made by addition in the following way: 1 part of a sinewave at the fundamental frequency (say, 1 Hz), then 1/2 as much of a sinewave at 2 Hz, and 1/3 as much at 3 Hz, and so on and so on and so on and so on .... (we have to do this infinitely, but we won't write infinitely many "and so ons"). The square wave in Figure (b) uses only the odd multiples: 1 part at 1 Hz, 1/3 as much at 3 Hz, 1/5 as much at 5 Hz, and so on.

Later, in Section 4.2, we'll talk about using this technique in synthesizing sound, called additive synthesis. If you want to jump ahead a bit, try the applet in 4.2 which lets you build simple waveforms from sinusoidal components. Notice, when you try to build a square wave, that there are little ripples on the edges of the square. This is called Gibbs ringing, and has to do with the fact that the sum of any finite number of these decreasing amounts of sinewaves of increasing frequency is never exactly a square wave.
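You can see the ripples in numbers, too. The sketch below adds up a finite number of sinewaves for a square wave and reports the highest point the sum reaches. It assumes the standard square-wave recipe (odd harmonics only, each at 1 over its harmonic number, with a 4/pi factor so that the ideal wave sits at exactly +1 and -1); the function names and the grid of 4000 test points are our own choices.

```python
import math

def square_partial_sum(t, n_terms):
    """Sum of the first n_terms odd harmonics of a square wave with period 1."""
    total = 0.0
    for j in range(1, n_terms + 1):
        k = 2 * j - 1  # odd harmonic numbers: 1, 3, 5, ...
        total += math.sin(2 * math.pi * k * t) / k
    return 4 / math.pi * total  # scale so the ideal square wave is +/-1

def peak_value(n_terms, points=4000):
    """Largest value of the partial sum, sampled over the positive half period."""
    return max(square_partial_sum(i * 0.5 / points, n_terms)
               for i in range(points + 1))

for n in (5, 50):
    print(n, "sinewaves -> highest point", round(peak_value(n), 3))
```

An ideal square wave would never go above 1, but the finite sums keep poking up to roughly 1.18 no matter how many sinewaves we add; the ripple gets narrower, not shorter.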

But what all these charts mean is that if you add up all those sinusoids, whose frequencies are integer multiples of the fundamental frequency of the sound, and whose amplitudes are described above by the height of the bar, you'll get the sawtooth and square wave.
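As a rough sketch of that adding-up, here is a tiny additive synthesizer. The bar heights are not reproduced from the charts; we assume the standard textbook recipes instead (every harmonic at 1/n for the sawtooth, odd harmonics only at 1/n for the square wave), and all the function names are made up for the example.

```python
import math

def additive(t, amplitudes, fundamental_hz=1.0):
    """Add sinewaves at integer multiples of the fundamental.

    amplitudes[n - 1] is the amount of the n-th harmonic.
    """
    return sum(a * math.sin(2 * math.pi * n * fundamental_hz * t)
               for n, a in enumerate(amplitudes, start=1))

def sawtooth_amplitudes(count):
    """1, 1/2, 1/3, ... for every harmonic."""
    return [1 / n for n in range(1, count + 1)]

def square_amplitudes(count):
    """1, 0, 1/3, 0, 1/5, ... (odd harmonics only)."""
    return [1 / n if n % 2 == 1 else 0.0 for n in range(1, count + 1)]

# Sample each waveform at ten moments across one period of a 1 Hz fundamental.
times = [i / 10 + 0.05 for i in range(10)]
saw = [round(additive(t, sawtooth_amplitudes(200)), 2) for t in times]
square = [round(additive(t, square_amplitudes(200)), 2) for t in times]
print("sawtooth:", saw)
print("square:  ", square)
```

With 200 harmonics the sawtooth values fall steadily across the period, while the square values sit near one level for the first half and near its negative for the second half. These unscaled sums level off at about pi/4 rather than 1, since we left out any overall scaling factor.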

This is what Fourier analysis is all about: every periodic waveform (which is the same, more or less, as saying every pitched sound) can be expressed as a sum of sinusoids whose frequencies are integer multiples of the fundamental, and whose amplitudes are unknown (that's the fun part). These charts are called spectral histograms (they don't show any evolution over time, since these waveforms are periodic!).
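Here is a minimal sketch of how those unknown amplitudes can be dug out of a waveform. It is not a full Fourier analysis, just the core idea: multiply one period of the wave by a sine and a cosine at the harmonic you care about, add everything up, and see how much is there. The function name and the "mystery" test wave are invented for the example.

```python
import math

def harmonic_amplitude(samples, harmonic):
    """Estimate the amplitude of one harmonic from exactly one period of samples.

    Correlating with both a sine and a cosine means the answer does not
    depend on where in its cycle that harmonic happens to start.
    """
    n = len(samples)
    s = sum(x * math.sin(2 * math.pi * harmonic * i / n)
            for i, x in enumerate(samples))
    c = sum(x * math.cos(2 * math.pi * harmonic * i / n)
            for i, x in enumerate(samples))
    return 2 * math.hypot(s, c) / n

# A "mystery" periodic wave: we know the recipe, the analysis does not.
N = 512
mystery = [1.0 * math.sin(2 * math.pi * 1 * i / N)
           + 0.5 * math.sin(2 * math.pi * 2 * i / N)
           + 0.25 * math.cos(2 * math.pi * 3 * i / N)
           for i in range(N)]

for h in range(1, 5):
    print("harmonic", h, "amplitude", round(harmonic_amplitude(mystery, h), 3))
```

Plotting those amplitudes as bars, one per harmonic, gives exactly the kind of chart described above.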

These sinewaves are sometimes referred to as the spectral components, partials, overtones, or harmonics of a sound, and are what was thought, in the old days, to be primarily responsible for our sense of timbre. So when we refer to the 10th partial of a timbre, we mean a sinusoid at 10 times the frequency of the sound's fundamental frequency (but we don't know its amplitude).
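The arithmetic is simple enough to sketch in a couple of lines. We borrow the 256 Hz tuning fork as the fundamental and the 20,000 Hz figure as the top of human hearing; the function name is made up, and we count the fundamental itself as partial number 1.

```python
def partial_frequency(fundamental_hz, n):
    """Frequency of the n-th partial, counting the fundamental as partial 1."""
    return n * fundamental_hz

FUNDAMENTAL_HZ = 256      # the tuning fork's frequency
HEARING_LIMIT_HZ = 20000  # rough upper limit of human hearing

tenth = partial_frequency(FUNDAMENTAL_HZ, 10)

# Collect every partial that still lands inside the range of hearing.
audible = []
n = 1
while partial_frequency(FUNDAMENTAL_HZ, n) <= HEARING_LIMIT_HZ:
    audible.append(partial_frequency(FUNDAMENTAL_HZ, n))
    n += 1

print("10th partial:", tenth, "Hz")
print("partials within hearing range:", len(audible))
```

Knowing where the partials are tells us nothing yet about how much of each one is present; that part still has to be measured.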

The sounds below are conventional instruments with their attacks lopped off, so that we can hear each instrument as a different periodic waveform, and listen to their different spectral configurations. Strangely enough, the clarinet (whose spectrum, dominated by odd harmonics, is a lot like a square wave's) and the flute, without their attacks, are not all that different (in the grand scheme of things).

Figure x. Clarinet sound.
Figure x. Clarinet with attack lopped off.
Figure x. Flute
Figure x. Flute with attack lopped off.
Figure x. Piano
Figure x. Piano with attack lopped off.
Figure x. Trombone (not played under water!)
Picture courtesy of: Bob Hovey http://www.TROMBONISTICALISMS.bigstep.com/
Figure x. Trombone with attack lopped off.
Figure x. Violin
Figure x. Violin with attack lopped off.
Figure x. Voice
Figure x. Voice with attack lopped off.
Photo courtesy of www.bobhoose.com
