Examining Committee:
(chairman) Larry Polansky
Charles Dodge
Geoff Davis
Edward Michael Berger
Dean of Graduate Studies
This thesis documents original applications of wavelet theory to digital audio signal processing for electro-acoustic musical purposes. A non-mathematical introduction to wavelets for the musician is given first, along with a brief description of previous work in wavelet theory and audio signal processing. A mathematical summary of multiresolution analysis, the heart of wavelet theory, is then presented. Three original applications of selected theoretical constructs in wavelet theory are then proposed and implemented. Specifically, the concepts of second-generation wavelet shrinkage, exponential decay of wavelet coefficients, and wideband-octave frequency decomposition are used to construct algorithms which denoise, spectrally enhance, and wideband-equalize digital audio, respectively.
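As an illustration of the wavelet shrinkage idea named above, the following is a minimal sketch and not the code developed in this thesis. It assumes the simplest possible wavelet (orthonormal Haar) in place of the lifted, interpolating wavelets and binary-coefficient filterbanks used in the later chapters, a power-of-two block length, and hard thresholding. The function names `haar_forward_level`, `haar_inverse_level`, and `haar_denoise` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// One orthonormal Haar analysis step on the first n samples of x:
// scaled averages go to the front half, scaled differences (details) to the back half.
void haar_forward_level(std::vector<double>& x, std::size_t n) {
    std::vector<double> tmp(n);
    const double s = std::sqrt(0.5);
    for (std::size_t i = 0; i < n / 2; ++i) {
        tmp[i]         = s * (x[2 * i] + x[2 * i + 1]);
        tmp[n / 2 + i] = s * (x[2 * i] - x[2 * i + 1]);
    }
    for (std::size_t i = 0; i < n; ++i) x[i] = tmp[i];
}

// Inverse of haar_forward_level on the first n samples of x.
void haar_inverse_level(std::vector<double>& x, std::size_t n) {
    std::vector<double> tmp(n);
    const double s = std::sqrt(0.5);
    for (std::size_t i = 0; i < n / 2; ++i) {
        tmp[2 * i]     = s * (x[i] + x[n / 2 + i]);
        tmp[2 * i + 1] = s * (x[i] - x[n / 2 + i]);
    }
    for (std::size_t i = 0; i < n; ++i) x[i] = tmp[i];
}

// Hypothetical helper: transform, zero every detail coefficient whose
// magnitude is below the threshold, then transform back.
std::vector<double> haar_denoise(std::vector<double> x, int levels, double threshold) {
    const std::size_t len = x.size();
    assert(len > 0 && (len & (len - 1)) == 0);    // block length must be a power of two
    assert(levels >= 0 && (len >> levels) >= 1);  // cannot analyze past one sample

    std::size_t n = len;
    for (int l = 0; l < levels; ++l, n /= 2) haar_forward_level(x, n);

    // x[0..n) now holds the final averages; x[n..len) holds the details.
    for (std::size_t i = n; i < len; ++i)
        if (std::fabs(x[i]) < threshold) x[i] = 0.0;

    for (int l = 0; l < levels; ++l) {
        n *= 2;
        haar_inverse_level(x, n);
    }
    return x;
}
```

With a threshold of zero the sketch returns its input unchanged (perfect reconstruction); with a very large threshold and a full-depth analysis, only the block average survives.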
A secondary goal of this thesis is to provide both the end-user
and the C++ programmer with documented, publicly available tools
that accomplish these audio-related, wavelet-based tasks. To this
end, the MATLAB and GNU C++ environments are used to prototype
and implement the three algorithms mentioned above on a Silicon
Graphics Indy workstation. Audio examples of the output of these
programs are presented along with each algorithm. Appendices document
the development of the MATLAB code and C++ libraries which were
used for these experiments. C++ classes for audio file (AIFF file
format) I/O, interpolating and non-interpolating delay-line processing,
real-time digital filtering (simple time-domain convolution),
and real-time audio processing are discussed. In addition, a failed
but instructional attempt at a C++ class that models a proposed
wavelet file format (WAIFF - Wavelet Audio Interchange File Format)
is discussed.
The author wishes to thank Professors Larry Polansky, Charles
Dodge, Jon Appleton, Eric Hansen, and Geoff Davis for all of their
help with the technical and conceptual development of the ideas
contained in this thesis.
In addition, the author wishes to thank fellow graduate students
Martin Dupras, Timothy Polashek, and Scott Lawrence for all of
their informal discussions on audio and musical signal processing.
This thesis is dedicated to Angelo S. DiDomenico of Framingham
(North) High School, Framingham, Massachusetts, who will be retiring
at the end of the 1996 academic year. "The rest is trivial."
| Abstract | ii | |
| Acknowledgments | iii | |
| Table of Contents | iv | |
| List of Figures | vi | |
| List of Tables | vii | |
| List of Audio Examples on accompanying CD | viii | |
| List of Programs | ix |
| Chapter 1 | Non-mathematical, introductory analogy of wavelet theory to fugue analysis and storywriting for the musician | 1 |
| Chapter 2 | Non-mathematical, introductory analogy of wavelet theory to fugue analysis and storywriting for the musician | 23 |
| Chapter 3 | Mathematical summary of multiresolution analysis and relevant wavelet theory | 45 |
| Chapter 4 | Second generation wavelet shrinkage: a comparison of the lifting scheme and circular convolution in the denoising of digital audio | 59 |
| Chapter 5 | Using the wideband-octave decomposition of wavelet analysis to achieve wideband equalization of digital audio | 71 |
| Conclusions and Future Directions | 90 |
| Appendix A | Technical notes on the equipment and software used in this thesis. | 92 |
| Appendix B | Guidelines on compiling and using the software utilities created for this thesis | 95 |
| Appendix C | Forward and inverse wavelet transform code for both the lifted and circular convolution methods | 96 |
| Appendix D | C++ source code for programs used in Chapter 3 | 117 |
| Appendix E | C++ source code for program used in Chapter 4 | 127 |
| Appendix F | C++ source code for program used in Chapter 5 | 134 |
| Appendix G | class Aiff_file, code for a C++ class which models AIFF (Audio Interchange File Format) I/O | 152 |
| Appendix H | class WAiff_file, code for a C++ class which models a possible wavelet file format (WAIFF - Wavelet Audio Interchange File Format) | 224 |
| Appendix I | class Audio_port, code for a C++ class which models real-time audio input/output | 251 |
| Appendix J | class Delayline, code for a C++ class which models an interpolating and non-interpolating delay line | 267 |
| Appendix K | class Filter, code for a C++ class which models a real-time, time-domain digital filtering (convolution) of digital audio | 290 |
| Appendix L | Prototype MATLAB code with examples | 302 |
| References | 339 |
| Figure | Description | |
| 1.1 | Comparison of different time/frequency resolutions in Fourier Analysis. | |
| 1.2 | Fourier analysis of a sharp transient. | |
| 1.3 | Wavelet-based analysis of a sharp transient. | |
| 1.4 | Mother wavelet and daughter wavelet shapes for different levels of a wavelet analysis. | |
| 1.5 | J.S. Bach, Von Himmel Hoch, Variation No. 4 (manuals only). Illustration of augmentation in a fugue. | |
| 1.6 | J.S. Bach, The Art of the Fugue, Canon No. 1. Illustration of augmentation in a lower register than the theme's register. | |
| 1.7 | J.S. Bach, The Art of the Fugue, Fugue No. 6. Graphical comparison of augmentation in a lower register to wavelet analysis. Overlay of wavelet grid on standard musical notation. | |
| 1.8 | Non-fugal example of wavelet-like analysis using standard musical notation | |
| 1.9 | Wavelet analysis of music in Figure 1.8 | |
| 1.10 | Construction of a simple story with text-based scaling functions. | |
| 2.1 | Recursive, digital filterbank configurations for the forward and inverse dyadic wavelet transforms. | |
| 3.1 | Denoising using wavelet analysis | |
| 4.1 | Exponential decay of wavelet coefficients across analysis levels. | |
| 4.2 | Spectral enhancement algorithm. | |
| 5.1 | Wavelet-based equalization | |
| 5.2 | Time-domain view of level 0 analysis coefficients using the 5/3 binary coefficient filterbank; produced using the equalize program. | |
| 5.3 | Time-domain view of level 0 analysis coefficients using the 6/10 binary coefficient filterbank; produced using the equalize program. | |
| 5.4 | Time-domain view of level 1 analysis coefficients using the 5/3 binary coefficient filterbank; produced using the equalize program. | |
| 5.5 | Time-domain view of level 1 analysis coefficients using the 6/10 binary coefficient filterbank; produced using the equalize program. | |
| Conc. 1 | A proposed waveshaping instrument that uses wavelet analysis as a control mechanism for the transfer function (lookup table) |
| Table | Description | |
| 2.1 | Wavelet coefficients for several analysis and resynthesis filters used in this thesis | |
| 2.2 | Number of vanishing moments for the filters listed in Table 2.1 | |
| 3.1 | List of audio examples using circular convolution denoising | |
| 3.2 | List of audio examples using the lifted, interpolating wavelet denoising algorithm | |
| 4.1 | List of audio examples using the spectral enhancement algorithm | |
| 5.1 | List of audio examples using the equalize program. Listening to individual levels of wavelet coefficients. | |
| 5.2 | List of audio examples using the equalize program. Time-varying equalization for compositional purposes. | |
| C.1 | Structure of an input data block to flwt(), a C++ function that performs the fast forward, lifted wavelet transform. | |
| C.2 | Structure of an output data block processed by flwt(), a C++ function that performs the fast forward, lifted wavelet transform. | |
| C.3 | Structure of an input data block to iflwt(), a C++ function that performs the fast inverse, lifted wavelet transform. | |
| C.4 | Structure of an output data block processed by iflwt(), a C++ function that performs the fast inverse, lifted wavelet transform. | |
| C.5 | Structure of an input data block to fwt(), a C++ function that performs the fast forward, circular convolution wavelet transform. | |
| C.6 | Structure of an output data block processed by fwt(), a C++ function that performs the fast forward, circular convolution wavelet transform. | |
| C.7 | Structure of an input data block to ifwt(), a C++ function that performs the fast inverse, circular convolution wavelet transform. | |
| C.8 | Structure of an output data block processed by ifwt(), a C++ function that performs the fast inverse, circular convolution wavelet transform. | |
| F.1 | A wavelet-based graphic equalizer's band characteristics | |
| G.1 | Names and descriptions of C++ programs that use class Aiff_file | |
| H.1 | Names and descriptions of C++ programs that use class WAiff_file | |
| I.1 | Names and descriptions of C++ programs that use class Audio_port | |
| J.1 | Names and descriptions of C++ programs that use class Delayline | |
| K.1 | Names and descriptions of C++ programs that use class Filter |
I. Denoising using binary wavelets
| CD Track # | Audio example | Artist | window size | iterations | threshold |
| 1 | 3.1 | Martin Dupras | original noisy sample | | |
| 2 | 3.2 | Martin Dupras | 32768 | 2 | 100 |
| 3 | 3.3 | Martin Dupras | 32768 | 4 | 100 |
| 4 | 3.4 | Martin Dupras | 32768 | 8 | 100 |
| 5 | 3.5 | Martin Dupras | 32768 | 12 | 30 |
| 6 | 3.6 | Martin Dupras | 32768 | 12 | 70 |
| 7 | 3.7 | Martin Dupras | 32768 | 12 | 200 |
| 8 | 3.8 | Martin Dupras | 32768 | 12 | 5000 |
II. Denoising using lifted, interpolated wavelets
| CD Track # | Audio example | Artist | window size | iterations | vanishing index | vanishing moments = (2 * vanishing index) - 1 | threshold |
| 9 | 3.9 | Martin Dupras | 32768 | 2 | 2 | 3 | 100 |
| 10 | 3.10 | Martin Dupras | 32768 | 4 | 2 | 3 | 100 |
| 11 | 3.11 | Martin Dupras | 32768 | 8 | 2 | 3 | 100 |
| 12 | 3.12 | Martin Dupras | 32768 | 12 | 2 | 3 | 30 |
| 13 | 3.13 | Martin Dupras | 32768 | 12 | 2 | 3 | 70 |
| 14 | 3.14 | Martin Dupras | 32768 | 12 | 2 | 3 | 200 |
| 15 | 3.15 | Martin Dupras | 32768 | 12 | 2 | 3 | 5000 |
| 16 | 3.16 | Martin Dupras | 32768 | 12 | 3 | 5 | 100 |
| 17 | 3.17 | Martin Dupras | 32768 | 12 | 4 | 7 | 100 |
III. Spectral Enhancement / Excitation
| CD Track # | Audio example | Artist / title of excerpt | version |
| 18 | 4.1 | Peter Gabriel / In Your Eyes | original 44.1 kHz file |
| 19 | 4.2 | " | normalized 11.025 kHz file |
| 20 | 4.3 | " | normalized, spectrally enhanced 44.1 kHz file |
| 21 | 4.4 | Disney, Little Mermaid / Kiss the Girl | original 44.1 kHz file |
| 22 | 4.5 | " | normalized 11.025 kHz file |
| 23 | 4.6 | " | normalized, spectrally enhanced 44.1 kHz file |
| 24 | 4.7 | Peter Gabriel / Red Rain | original 44.1 kHz file |
| 25 | 4.8 | " | normalized 11.025 kHz file |
| 26 | 4.9 | " | normalized, spectrally enhanced 44.1 kHz file |
| 27 | 4.10 | J.C. Risset / Inharmonique | original 44.1 kHz file |
| 28 | 4.11 | " | normalized 11.025 kHz file |
| 29 | 4.12 | " | normalized, spectrally enhanced 44.1 kHz file |
| 30 | 4.13 | Murder, Inc. / Mania | original 44.1 kHz file |
| 31 | 4.14 | " | normalized 11.025 kHz file |
| 32 | 4.15 | " | normalized, spectrally enhanced 44.1 kHz file |
| 33 | 4.16 | Killing Joke / White Out | original 44.1 kHz file |
| 34 | 4.17 | " | normalized 11.025 kHz file |
| 35 | 4.18 | " | normalized, spectrally enhanced 44.1 kHz file |
| 36 | 4.19 | Murder, Inc. / Mr. Whiskey's Name | original 44.1 kHz file |
| 37 | 4.20 | " | normalized 11.025 kHz file |
| 38 | 4.21 | " | normalized, spectrally enhanced 44.1 kHz file |
| 39 | 4.22 | Corey Cheng / Woods | original 44.1 kHz file |
| 40 | 4.23 | " | normalized 11.025 kHz file |
| 41 | 4.24 | " | normalized, spectrally enhanced 44.1 kHz file |
IV. Listening to individual levels of wavelet coefficients
| CD Track # | Audio example | Artist / Title | Description | Approximate band characteristics of this level of coefficients at a sampling rate of 44.1 kHz (given in Hz) | Control file used | Wavelet filters used / vanishing moments (see Tables 2.1 and 2.2) |
| 42 | 5.1 | (White noise) | original file | | | |
| 43 | 5.2 | " | level 0 "final average" coeffs only | 0 - 689.062 | solo.level0_of_5.eq | 6/10, 2/2 |
| 44 | 5.3 | " | level 1 wavelet coefficients only | 689.062 - 1378.12 | solo.level1_of_5.eq | 6/10, 2/2 |
| 45 | 5.4 | " | level 2 wavelet coefficients only | 1378.12 - 2756.25 | solo.level2_of_5.eq | 6/10, 2/2 |
| 46 | 5.5 | " | level 3 wavelet coefficients only | 2756.25 - 5512.5 | solo.level3_of_5.eq | 6/10, 2/2 |
| 47 | 5.6 | " | level 4 wavelet coefficients only | 5512.5 - 11025.0 | solo.level4_of_5.eq | 6/10, 2/2 |
| 48 | 5.7 | " | level 5 wavelet coefficients only | 11025.0 - 22050.0 | solo.level5_of_5.eq | 6/10, 2/2 |
| 49 | 5.8 | Hootie and the Blowfish / Only Wanna Be With You | original file | | | |
| 50 | 5.9 | " | level 0 "final average" coeffs only | 0 - 689.062 | solo.level0_of_5.eq | 6/10, 3/3 |
| 51 | 5.10 | " | level 1 wavelet coefficients only | 689.062 - 1378.12 | solo.level1_of_5.eq | 6/10, 3/3 |
| 52 | 5.11 | " | level 2 wavelet coefficients only | 1378.12 - 2756.25 | solo.level2_of_5.eq | 6/10, 3/3 |
| 53 | 5.12 | " | level 3 wavelet coefficients only | 2756.25 - 5512.5 | solo.level3_of_5.eq | 6/10, 3/3 |
| 54 | 5.13 | " | level 4 wavelet coefficients only | 5512.5 - 11025.0 | solo.level4_of_5.eq | 6/10, 3/3 |
| 55 | 5.14 | " | level 5 wavelet coefficients only | 11025.0 - 22050.0 | solo.level5_of_5.eq | 6/10, 3/3 |
| 56 | 5.15 | Bye Bye Birdie Soundtrack / Honestly Sincere | original file | | | |
| 57 | 5.16 | " | level 0 "final average" coeffs only | 0 - 689.062 | solo.level0_of_5.eq | 6/10, 3/3 |
| 58 | 5.17 | " | level 1 wavelet coefficients only | 689.062 - 1378.12 | solo.level1_of_5.eq | 6/10, 3/3 |
| 59 | 5.18 | " | level 2 wavelet coefficients only | 1378.12 - 2756.25 | solo.level2_of_5.eq | 6/10, 3/3 |
| 60 | 5.19 | " | level 3 wavelet coefficients only | 2756.25 - 5512.5 | solo.level3_of_5.eq | 6/10, 3/3 |
| 61 | 5.20 | " | level 4 wavelet coefficients only | 5512.5 - 11025.0 | solo.level4_of_5.eq | 6/10, 3/3 |
| 62 | 5.21 | " | level 5 wavelet coefficients only | 11025.0 - 22050.0 | solo.level5_of_5.eq | 6/10, 3/3 |
| 63 | 5.22 | Sting / Fields of Gold | original file | | | |
| 64 | 5.23 | " | level 0 "final average" coeffs only | 0 - 689.062 | solo.level0_of_5.eq | 6/10, 3/3 |
| 65 | 5.24 | " | level 1 wavelet coefficients only | 689.062 - 1378.12 | solo.level1_of_5.eq | 6/10, 3/3 |
| 66 | 5.25 | " | level 2 wavelet coefficients only | 1378.12 - 2756.25 | solo.level2_of_5.eq | 6/10, 3/3 |
| 67 | 5.26 | " | level 3 wavelet coefficients only | 2756.25 - 5512.5 | solo.level3_of_5.eq | 6/10, 3/3 |
| 68 | 5.27 | " | level 4 wavelet coefficients only | 5512.5 - 11025.0 | solo.level4_of_5.eq | 6/10, 3/3 |
| 69 | 5.28 | " | level 5 wavelet coefficients only | 11025.0 - 22050.0 | solo.level5_of_5.eq | 6/10, 3/3 |
| 70 | 5.29 | Sting / Fields of Gold | original file | | | |
| 71 | 5.30 | " | level 0 "final average" coeffs only | 0 - 689.062 | solo.level0_of_5.eq | 5/3, 2/2 |
| 72 | 5.31 | " | level 1 wavelet coefficients only | 689.062 - 1378.12 | solo.level1_of_5.eq | 5/3, 2/2 |
| 73 | 5.32 | " | level 2 wavelet coefficients only | 1378.12 - 2756.25 | solo.level2_of_5.eq | 5/3, 2/2 |
| 74 | 5.33 | " | level 3 wavelet coefficients only | 2756.25 - 5512.5 | solo.level3_of_5.eq | 5/3, 2/2 |
| 75 | 5.34 | " | level 4 wavelet coefficients only | 5512.5 - 11025.0 | solo.level4_of_5.eq | 5/3, 2/2 |
| 76 | 5.35 | " | level 5 wavelet coefficients only | 11025.0 - 22050.0 | solo.level5_of_5.eq | 5/3, 2/2 |
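The band edges listed in the table above follow the octave structure of a dyadic wavelet analysis: each level of wavelet coefficients covers roughly one octave below the Nyquist frequency, and the "final average" coefficients cover what remains down to 0 Hz. The following minimal sketch reproduces those edges for a five-level analysis at 44.1 kHz. The function name `level_band_hz` is hypothetical, and the sketch assumes ideal brick-wall bands, which the actual filterbanks only approximate.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper: approximate (low, high) band edges in Hz for one level
// of a dyadic wavelet analysis with num_levels levels.
// Level 0 is the "final average" band; level num_levels is the top octave.
std::pair<double, double> level_band_hz(int level, int num_levels, double sample_rate) {
    assert(num_levels >= 1 && level >= 0 && level <= num_levels);
    const double nyquist = sample_rate / 2.0;
    if (level == 0) {
        // Everything below the lowest octave of wavelet coefficients.
        return {0.0, nyquist / static_cast<double>(1 << num_levels)};
    }
    // Each wavelet level spans one octave ending at `high`.
    const double high = nyquist / static_cast<double>(1 << (num_levels - level));
    return {high / 2.0, high};
}
```

For example, `level_band_hz(1, 5, 44100.0)` gives 689.0625 Hz to 1378.125 Hz; the table entries appear to be the same values printed to six significant digits.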
V. Time-varying wavelet-based equalization
| CD Track # | Audio example | Artist / Title | Description | Control file used |
| 77 | 5.36 | Hootie and the Blowfish / Only Wanna Be With You | original file | time_varying.eq |
| 78 | 5.37 | " | processed file | time_varying.eq |
| 79 | 5.38 | Sting / Fields of Gold | original file | time_varying.eq |
| 80 | 5.39 | " | processed file | time_varying.eq |
| 81 | 5.40 | Killing Joke / White Out | original file | time_varying.eq |
| 82 | 5.41 | " | processed file | time_varying.eq |
| 83 | 5.42 | Murder, Inc. / Mr. Whiskey's Name | original file | time_varying.eq |
| 84 | 5.43 | " | processed file | time_varying.eq |
VI. Delayline usage
| CD Track # | Audio example | file | description |
| N/A | N/A | delaytest2.cc | non-audio testing of class Delayline for floating point values |
| N/A | N/A | delaytest3.cc | test of class Delayline for stereo audio data |
| 85 | j.1 | delaytest4.cc | four channel chorusing example, original file (Palestrina) |
| 86 | j.2 | delaytest4.cc | four channel chorusing example. stereo mixdown of deep quad chorusing of soundfile (Palestrina) |
| 87 | j.3 | delaytest5.cc | exhibition of high-frequency attenuation of noise due to non-integral sample delay tap. original file (white noise) |
| 88 | j.4 | delaytest5.cc | exhibition of high-frequency attenuation of noise due to non-integral sample delay tap. processed file (white noise) |
| 89 | j.5 | ac_test4.cc | implementation of Perry Cook's slide flute |
| 90 | j.6 | ac_test4a.cc | implementation of Perry Cook's slide flute with time-varying parameters |
| 91 | j.7 | granular.cc | application of granular synthesis for time-stretching purposes. original file (Beethoven) |
| 92 | j.8 | granular.cc | application of granular synthesis for time-stretching purposes. processed file (Beethoven) |
| 93 | j.9 | granular.cc | application of granular synthesis for time-compression purposes. original file (Palestrina) |
| 94 | j.10 | granular.cc | application of granular synthesis for time-compression purposes. processed file (Palestrina) |
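The high-frequency attenuation demonstrated in examples j.3 and j.4 can be illustrated with a minimal sketch of a delay line that reads a non-integral tap by linear interpolation. This is not the thesis's class Delayline: the class name `SketchDelay` and its interface are hypothetical, and linear interpolation between the two neighboring samples is assumed. A half-sample delay averages adjacent samples, so a signal alternating at the Nyquist frequency is cancelled while a constant signal passes unchanged.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical circular-buffer delay line with a fractional read tap.
class SketchDelay {
public:
    explicit SketchDelay(std::size_t capacity) : buf_(capacity, 0.0), write_(0) {
        assert(capacity >= 2);
    }

    // Store one new sample, overwriting the oldest one.
    void write(double sample) {
        buf_[write_] = sample;
        write_ = (write_ + 1) % buf_.size();
    }

    // Read the value `delay` samples behind the most recent write.
    // A fractional delay is linearly interpolated between its two neighbors.
    double read(double delay) const {
        assert(delay >= 0.0 && delay <= static_cast<double>(buf_.size() - 1));
        const std::size_t whole = static_cast<std::size_t>(delay);
        const double frac = delay - static_cast<double>(whole);
        const double newer = at(whole);
        const double older = (frac > 0.0) ? at(whole + 1) : newer;
        return (1.0 - frac) * newer + frac * older;
    }

private:
    // Sample written `back` samples before the most recent one (back < capacity).
    double at(std::size_t back) const {
        const std::size_t n = buf_.size();
        return buf_[(write_ + n - 1 - back) % n];
    }

    std::vector<double> buf_;
    std::size_t write_;
};
```

An integral tap returns a stored sample exactly, so the attenuation arises only when the tap falls between samples.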
VII. Filter usage
| CD Track # | Audio example | file | description |
| 95 | k.1 | filter_test.cc | demonstration of low pass filtering on an existing soundfile. original soundfile (Beethoven) |
| 96 | k.2 | filter_test.cc | demonstration of low pass filtering on an existing soundfile. processed soundfile (Beethoven) |
| N/A | N/A | filter_test2.cc | demonstration of different types of filters (low-pass, high-pass, notch) working on real-time audio input |
| Program name | Description | Page of source code listing | Audio example (if available) |
| normal_wavelet_transform.h | contains definitions for fwt() and ifwt(), functions that perform the wavelet transform with circular convolution | 106 | |
| lifted_wavelet_transforms.h | contains definitions for flwt(), iflwt(), and interpolate(), functions that perform the interpolated, lifted transform | 110 | |
| equalize | Wavelet-based equalization program. | 140 | 5.1 - 5.43 |
| spectral_enhance | Wavelet-based spectral enhancement program. Adds high frequencies to low-sample-rate digital audio files. | 129 | 4.1 - 4.24 |
| circ_conv_wavelet_denoise | Wavelet-based denoising program that uses standard circular convolution for the analysis/resynthesis digital filters. | 124 | 3.1 - 3.8 |
| lifted_wavelet_denoise | Wavelet-based denoising program that uses lifted, interpolated wavelets. | 121 | 3.9 - 3.17 |
| ac.h | contains declaration for class Aiff_file | 162 | |
| ac.error.cc | contains declarations for error messages used by class Aiff_file and other classes | 171 | |
| ac.cc | contains definitions for member functions of class Aiff_file | 173 | |
| ac_test | demonstrates constructors, prints soundfile information | 205 | |
| ac_test2 | demonstrates file copying with read_frames and write_frames | 206 | |
| ac_test3 | rudimentary one-zero filter with class Delayline (see Appendix J) | 208 | |
| ac_test4 | implementation of Perry Cook's slide flute with class Aiff_file | 209 | j.5 |
| ac_test4a | implementation of Perry Cook's slide flute with time dependent parameters | 211 | j.6 |
| ac_test5 | more tests of the read_frames and write_frames functions | 213 | |
| ac_test7 | more tests of the read_frames and write_frames functions | 214 | |
| ac_test8 | transport controls demonstrations | 216 | |
| ac_test9 | more testing of transport controls | 218 | |
| ac_test10 | 4 and 8 channel i/o tests | 220 | |
| ac_test11 | testing backwards reading of files | 222 | |
| ac_test12 | testing backwards reading of files for sample bitwidths other than 16 | 223 | |
| ac3.h | contains declaration for class WAiff_file | 231 | |
| ac3.cc | contains definition for member functions of class WAiff_file | 236 | |
| waiff_test.cc | contains a test driver program for class WAiff_file | 248 | |
| ac2.h | contains declaration and definition for class Audio_port | 257 | |
| audio_port_test | demonstration of floating point and integer sample I/O | 262 | |
| audio_port_test2 | example of real-time digital filtering with class Filter | 264 | |
| delayline.h | contains declaration and definition for class Delayline | 273 | |
| delaytest2 | non-audio testing of class Delayline for floating point values | 277 | |
| delaytest3 | test of class Delayline for stereo data | 278 | |
| delaytest4 | class Delayline is used to implement four-channel chorusing of a stereo soundfile | 281 | j.1, j.2 |
| delaytest5 | exhibition of high-frequency attenuation of noise due to linear interpolation | 283 | j.3, j.4 |
| granular | implementation of simple granular synthesis using multiple delay lines | 288 | j.7, j.8, j.9, j.10 |
| filter.h | contains definitions for member functions of class Filter | 295 | |
| filter_test | demonstration of low pass filtering on an existing soundfile | 298 | k.1, k.2 |
| filter_test2 | demonstration of different types of filters (low-pass, high-pass, notch) working on real-time audio input | 299 |