Go to:
Title Page               Chapter 1           Appendix A   Appendix J
Copyright                Chapter 2           Appendix B   Appendix K
Abstract                 Chapter 3           Appendix C   Appendix L
Acknowledgements         Chapter 4           Appendix D   References
Table of Contents        Chapter 5           Appendix E
List of Figure           Conclusions/        Appendix F
List of Tables           Future Directions   Appendix G
List of Audio Examples                       Appendix H
List of Programs                             Appendix I


Chapter 2

Mathematical summary of multiresolution analysis and relevant wavelet theory



I. Introduction and Overview

The forward and inverse wavelet transform of a window of samples can be implemented using a set of upsamplers, downsamplers, and recursive two-channel digital filter banks. For the forward wavelet transform, the output of the filter banks are the wavelet coefficients at a desired level of resolution; for the inverse wavelet transform, the wavelet coefficients are the input, and the output of the final set of filter banks is the reconstructed window. Typically, the wavelet coefficients are analyzed or altered in the intermediate stage before the inverse transform is performed. Here is a diagram of the entire analysis/resynthesis process for a three level, perfect reconstruction wavelet analysis and resynthesis:


Figure 2.1

First, a window of input samples is selected which is a power of 2. The window of input samples is then processed twice in parallel, once through the high pass filter, and once through the lowpass filter. The outputs of the filters are then downsampled so that each output is exactly one half the size of the original input (every other sample of each output is omitted). The downsampled output from the first highpass filter becomes the wavelet coefficients at level N, where N is the number of levels of wavelet analysis desired (in Figure 2.1, N = 3). These coefficients contain the highest time-resolution, widest bandwidth information, and highest center frequency wavelet coefficients of the analysis. The downsampled output from the first lowpass filter serves as the input to an identical highpass/lowpass, downsampling stage. The next stage's downsampled output from the highpass filter will become the wavelet coefficients for level N-1 (N-1 = 2 in Figure 2.1). The (N-1)th level wavelet coefficients will have lower characteristic time-resolution by a factor of two, narrower characteristic bandwidth information by a factor of two, and lower characteristic center frequency by a factor of two. This process continues for the desired number of levels. The output of the final low-pass filter has the same size as the level 1 wavelet coefficients, and are referred to as the final average coefficients.

The inverse transform works in a similar manner. First, the level 1 wavelet coefficients and the final average coefficients are upsampled so that each is twice its original size (zeros are inserted between every sample of each set of coefficients). The upsampled versions of the final average and the level 1 wavelet coefficients are passed through the inverse low pass and inverse high pass filters, respectively (note that the analysis and synthesis high pass filters are different, as are the analysis and synthesis low pass filters). The outputs are added together, and this sum serves as the input to the low-pass channel for the next stage. This process continues until all of the wavelet coefficients have been upsampled, filtered, and added to the total output of previous stages' processing. The final sum is the reconstructed input.

The upsampling and downsampling depicted in Figure 2.1 is important in ensuring that the size of the output (number of total wavelet coefficients on all levels + number of samples in the final average) is the same size as the input. If the downsampling did not occur in the analysis (forward transform), there would be a "data explosion" problem similar to the one that occurs for FFT windows which severely overlap each other (Moore 106). Since the outputs of the final low pass filter and all the high pass filters are the final average and wavelet coefficients of a data set, respectively, the upsampling and downsampling operations can also be interpreted in the following way: downsampling in the analysis half of the algorithm ensures that there is an equal amount of data in the input signal wavelet domain representation. In other words, if 256 samples are in the input signal, then there will be 128 level 3 wavelet coefficients, 64 level 2 coefficients, 32 level 3 coefficients, and 32 coefficients in the final average, for a total of 256 coefficients in the wavelet representation of the 256 sample input. Another way to view the two-fold decrease in the number of wavelet coefficients for subsequent levels of analysis is that there is a change in the time resolution from one level to another. Since the wavelet coefficients on each level describe the entire window's time-frequency information, more coefficients on one level means a higher time-resolution for that level's wavelet coefficients; consequently, fewer coefficients on a lower level mean a lower time resolution for that level's wavelet coefficients.

It is important to note that the mother wavelet's shape is not being used directly in the computation of the forward and inverse wavelet transforms; the high and low pass filters, which are related to the mother wavelet, are used for all computations. In general, the high and low pass filters, which are different for the analysis and resynthesis stages, usually have cutoff frequencies near one quarter the sample rate (one half the Nyquist frequency). Therefore, while the upsampling and downsampling operations explain the increased time resolution for higher levels of analysis, the equal-bandwidth high and low pass filter structure explains the higher frequency resolution for lower levels of analysis. This can be seen as follows: the input, which contains full-bandwidth frequency information, is separated in the level N analysis into two spectral components: high frequency information (level N wavelet coefficients) and low frequency information (input to next analysis stage). Each output, then, has half the bandwidth of its input, since the cutoff frequencies of the high and low pass filters are at approximately half the spectrum. Since the low-frequency information is split again into high and low frequency information, wavelet coefficients at subsequent levels have better and better frequency resolution (lower bandwidth of information), while having worse and worse time resolution.

Although the digital filters in Figure 2.1 do have other special properties described below, their implementations in the forward and inverse transform are straightforward. Each filter has a set of filter coefficients, which are usually employed in a time-domain convolution with the input window's samples.

A common problem in the implementation of digital filters on windowed data is how to deal with the boundaries of the window. For example, when a long soundfile is divided into several 4096-sample windows, the wavelet coefficients that correspond to the first and last few samples of each 4096-sample window exhibit artifacts from the windowing. There are several methods that have been developed outside of wavelet theory that deal with this problem, including symmetric extension, zero-padding, and circular convolution (Strang 340). In this thesis, circular convolution is used for most windowing purposes since it is the most straightforward to implement. The reader is referred to standard textbooks on signal processing for details on the circular convolution scheme (Oppenheim 542).

II. Signal processing approach to wavelet analysis

Let us return to the details of the high and low pass filters associated with wavelet analysis. The high and low pass filters represented above must satisfy several mathematical constraints in order to have the defining properties associated with a multiresolution, or wavelet analysis. Most importantly, the filters must satisfy a "perfect reconstruction" constraint for the filter coefficients. Mathematically, no distortion nor aliasing should occur for a signal that enters and leaves the filter bank with no processing of wavelet coefficients. In other words, the output should be the same as the input for no processing of wavelet coefficients:


(Strang 104)

In the design of such filters, a good simplification to make that immediately satisfies the no alias requirement is to assign

(Strang 105)

This choice of filters not only automatically satisfies the no-aliasing requirement, but it also gives us "two filters for the price of one"; that is, if we design the H (analysis) filters, (2.3) will quickly give us the F(synthesis) filters.

We can place even more restrictions on the relationship between the two analysis filters to further narrow our filter design parameters. (2.4) and (2.5) each impose certain restrictions on the relationship between H0 and H1 that apply for equal length filters, where the filter coefficients of H0 are h(0), h(1), H(2)... H(N).

(Strang 109)

(2.4) is known as the alternating sign construction: if the filter coefficients of H0 are h(0), h(1), H(2)... H(N), then the filter coefficients of H1 are h(0), -h(1), h(2), -h(3)... The pair of filters that results from such a construction are referred to as QMF (Quadrature Mirror Filters) filters. As Strang says in his book, "The highpass response |H1(ej)| is a mirror image of the lowpass magnitude |H0(ej)| with respect to the middle frequency /2 - the quadrature frequency (Strang 109)."

Another choice that relates H0(z) to H1(z) is called the alternating flip construction. If the filter coefficients of H0 are h(0), h(1), H(2)... H(N), then the filter coefficients of H1 are h(N), -h(N-1), h(N-2), -h(N-3).... Mathematically, this can be expressed in the z-domain relation


(Strang, 109)

This construction produces a pair of orthogonal filters. The celebrated Daubechies filters, for example, are orthogonal filters (Daubechies). Not all filterbanks are orthogonal; however, all perfect reconstructing filterbanks are biorthogonal.

The distinction between orthogonal and biorthogonal filter banks is significant, and involves parallel concepts in linear space theory and functional analysis. Orthogonal filter banks contain filters that are mathematically "neater" than filters in biorthogonal filter banks; in wavelet analysis, an orthogonal filter bank corresponds to a set of basis functions which are all orthogonal to one another, a desirable property when constructing basis functions for any space. However, filters in an orthogonal filter bank cannot have coefficients which are symmetric (or anti-symmetric) about a central axis. In audio processing, a filter which is not symmetric (or anti-symmetric) about a central axis produces phase distortion in its output because it is not a linear phase filter (Oppenheim 254). Although phase distortion is a relatively mild form of distortion for audio, it is nonetheless desirable to eliminate as many kinds of distortion possible in the final filtering of a signal. Biorthogonal filter banks, although "messier" mathematically, allow for symmetric filter coefficients, and thus produce linear-phase, non-phase-distorting filters. Therefore, biorthogonal, symmetric filters are used exclusively in this thesis.

To summarize, we have developed a scheme for executing the forward and inverse wavelet transform using a recursive set of filters. Four filters are involved, a high pass filter and low pass filter for both the analysis and resynthesis. By relating the analysis filters with the synthesis filters through (2.1), (2.2), and (2.3), and by relating the analyzing high pass and low pass filters through (2.4) and (2.5), we are left with the design of only one filter. In order to design this filter, we leave the language of signal processing and now turn to another branch of mathematics: functional analysis.

III. Multiresolution and functional analysis approach to wavelet analysis

Multiresolution analysis is functional analysis' theoretic construct which parallels the above signal-processing-style filter bank interpretation of wavelets. Multiresolution analysis is the separation of the Hilbert space L2 into subspaces, where the subspaces and basis functions of these subspaces have special relationships to each other. Consequently, in a multiresolution analysis, functions belonging to each subspace must be composed of that subspace's basis functions, and therefore must have certain properties related to the basis functions of the subspace they span. Specifically, a biorthogonal multiresolution analysis is a division of L2 into subspaces and such that the following requirements are met:


(2.6) says that is a set of subspaces whose union over j is the complete space L2. Furthermore, (2.6) says that if a function is a member of , then it is also a member of the "larger" subspace. (2.7) establishes a "dual" set of subspaces for which the same condition is true. In this sense, both the 's and the 's each span the complete space L2. (2.8) states that if a function is in , then a two-scale "compressed" version of that function is in . (2.10) states that if a function is in , then all of its time axis translates are also members of . (2.9) and (2.11) establish a "dual" set of subspaces for which the same conditions are true.

Given the discussion above, recursion can show the following important central fact of multiresolution analysis: if a certain function is a member of , not only are the function and all of its time-axis translates also members of and, but the function's two-fold "time-compressed version" and all if its time-axis translates are members of . Therefore, an important corollary is that the basis functions of the 's must follow this rule. That is, if a certain basis function is a member of , not only are the basis function and all of its time-axis translates also members of and, but the basis function's two-fold "time-compressed version" and all if its time-axis translates are members of . Because of (2.11), the same conclusions can be said of the dual set of subspaces . (2.12) makes the existence of the basis functions formal for both the primary () and dual () subspaces.

Furthermore, (2.12) requires the two sets of bases to have a certain relationship to one another: biorthogonality. The concept of biorthogonal bases is a well-developed concept in the areas of linear algebra and functional analysis. The following theorem is important in understanding why a pair of subspaces with biorthogonal basis functions is required for multiresolution analysis:

(Strang 71)

The need for two different expressions of the same function in two different sets of subspaces ( and) must be linked by the biorthogonality constraint of the corresponding basis functions for perfect reconstruction purposes. As will be shown later, one set of basis functions is linked with the analysis of a function, while another set of basis functions is linked with the resynthesis of the function. The biorthogonality constraint guarantees that the entire analysis and resynthesis will produce a perfect reconstruction. The biorthogonality constraint on the basis functions will eventually lead to the development of biorthogonal filter banks as discussed in II. above. The reader will recall that biorthogonality of the filters is required for symmetric filters which are linear in phase response and therefore produce no phase distortion for audio input signals.

Once these two sets of subspaces have been established, wavelet function subspaces enter the discussion as the difference subspace between different (and). Since contains the translates of two-fold time-axis compressed basis functions of , and since any function that is in is also in, it is reasonable to say that contains more "detailed" basis functions than ; that is, the projection of an arbitrary L2 function onto will contain more "details" than the projection of that same function onto because of the added detail possible in . Since any function that is in is also in, there must be a set, or subspace of functions which are contained in but are not contained in . Let this subspace be , and let the basis functions that span this subspace be called the wavelets basis functions . Since the wavelet basis functions, which span are also functions in , the inherit many important properties from the basis functions in . Significantly, the properties of scale and shift invariance stated in (2.8) - (2.11) also apply to the wavelet basis functions . That is, if a certain wavelet basis function is a member of , not only are the wavelet basis function and all of its time-axis translates also members of and, but the wavelet basis function's two-fold "time-compressed version" and all if its time-axis translates are members of .

In addition, the following properties relate the 's and 's, show how the 's can be used to compute the 's, and how and the 's can be used to compute the's (for j_0).





For example, consider an arbitrary function f that lies in L2. . Since the closure of the union of the 's is complete in L2., and since the closure of the union of the 's is complete in L2., we can construct successive approximations to f which are increasingly detailed; these approximations will be contained in the increasingly more detailed subspaces and . Now we can see the use of (2.13) and (2.14). In order to do this decomposition of f into linear combinations of the basis functions (which span), we need to use the biorthogonal relationship between and . That is, in order to compute the contribution of a certain set of 's to the approximation to f in subspace , we need to take the dot product of f with the corresponding set of 's which correspond to the dual subspace . Once we have found this set of successive approximations to f, we can use (2.15) to find how two successive approximations in and differ from each other. This difference function is contained in the subspace , and is spanned by a certain set of wavelet functions.

IV. Relationship of processing theory to multiresolution and functional analysis; Derivation of the forward and inverse fast wavelet transforms

All that remains is to show how this theory relates back to the filterbank scheme depicted in Figure 2.1. In fact, the filter bank scheme in Figure 2.1 exactly performs this analysis and resynthesis routine in actual implementations of the wavelet transform. The decomposition process just discussed is exactly analogous to the analysis stage of the filter bank scheme depicted in Figure 2.1. The projection of f onto is analogous to the "final average" in Figure 2.1, the projection of f onto is analogous to the level 3 wavelet coefficients, the projection of f onto is analogous to the level 2 wavelet coefficients, and the projection of f onto is analogous to the level 1 wavelet coefficients (Please note the inverse relationship between the indices of the 's and the wavelet levels).

To find the connection between filterbanks and multiresolution analysis, let us ask the following question: What exactly are the basis functions which span the subspaces and , and how do we compute them numerically? To begin, we can summarize (2.6)-(2.12) in a few, concise quantitative statements that relate the basis functions Scale invariance, shift invariance, and the nested property of the subspaces such that generate the following set of relations:



(2.19) - (2.22) are known as dilation equations, two-scale difference equations, or refinement equations. All four equations simply express a change of basis from one subspace to another. In other words, these relations relate the basis functions of the subspaces (and the subspaces) by expressing one set of basis functions in terms of a linear combination of another set of basis functions. The constants h0, h1, f0, and f1 are the coefficients that describe the exact linear transformation that is required to achieve this change of basis.

In fact, these changes of bases, which involve factors of 2 along the time axis, can be used to write the central recursive relationship between the decomposition of an arbitrary L2. function into the subspaces .

By (2.12), has basis functions . Now, let



We find by taking the L2 inner (dot) product with the dual (biorthogonal) basis functions , as described in (2.13):





Now, recall the change of basis property of the basis functions, (2.21), to write







However, the term in the last integral of (2.27) is just an expression for the basis functions of . Therefore, when the integral, or inner product in (2.27) is evaluated, the and terms will combine to reduce to a term of inside the summation sign. The important fact from this reasoning is that one can derive the 's from the 's recursively:



By similar reasoning, the same relationship is true for 's; that is, they can be derived recursively from the scaling coefficients one level above:



Note the presence of the "change of basis" coefficients in (2.28) and (2.29). Now, recall that (2.19) - (2.22) relate any two sets of primary/dual scaling/wavelet basis functions on adjacent levels. Therefore, (2.27) can be written in such a way that it is a general relationship between two adjacent levels, not only just levels zero and one. In other words, in general, the wavelet and scaling coefficients can be computed directly from certain "change of basis coefficients" and the scaling coefficients from one level above. This is the key recursion in the computation of the wavelet transform. Together, they are called the fast wavelet transform:




The summations in (2.30) are familiar; they are downsampled convolutions, with filter coefficients . Therefore, analysis of (2.30) provides the link to the filter bank theory that was introduced at the beginning of this chapter. Not only are "change of basis coefficients" as introduced in (2.19) - (2.22), but they are also digital filter coefficients. The downsampled output from filters will produce the scaling coefficients and wavelet coefficients , respectively, if both the filters are presented with the scaling coefficients as input. (2.30) can be interpreted as a recursive relationship, and consequently any level of wavelet and/or scaling coefficients can be computed if the scaling coefficients are given, and are taken as the starting points for the recursion. Then the become the "level N-n"; "level N-(n-1)"; "level N-(n-2)"... wavelet coefficients in Figure 2.1, and the final computation of the becomes the "final average" in Figure 2.1. Typically, the signal to be analyzed with wavelet analysis is windowed, and each window of raw input becomes the inputs to the recursive filter bank scheme. Mathematically, this arbitrary assignment of sample values to is not theoretically supported; in fact, Strang even calls it a "wavelet crime (Strang 232)." Nonetheless, it is convenient to take the input samples as the starting point in the recursive scheme, and it is possibly the most common starting point for wavelet analysis. In this thesis, therefore, the input samples are indeed windowed, and are assigned to be the scaling coefficients .

The inverse wavelet transform is similar to the forward transform just discussed. Similar reasoning can show that a relationship resembling (2.30) exists that parallels the second half of the filter bank scheme in Figure 2.1. The only thing left to explain is why the outputs of the filters in the second half of Figure 2.1 add before entering the next stage of the reconstruction.

(2.15) and (2.16) show us that there are two expressions for any function in the subspace : we can expand in terms of the basis functions in , or expand in terms of the basis functions of . Together with (2.19) - (2.22), we can make the following relationship:




The addition sign in (2.31) is significant: if a similar analysis is performed on (2.31) as is performed on (2.19) - (2.22) through (2.23) - (2.30), then the addition sign can be traced to the following result. This result is analogous to (2.30), and is called the inverse fast wavelet transform:





Now we can show that the are the four filters in Figure 2.1. Recall the restrictions we placed on the filters in (2.1) - (2.5). By imposing two sets of restrictions on interrelationships of these filters (one which relates the analysis and synthesis high and low pass filters to each other, one which relates the high and low pass analysis filters to each other), we were left with the design of only one filter from which the other three could be easily computed in terms of filter coefficient order reversal and/or filter coefficient sign negation. Recall also the dilation equations (2.19) - (2.22). If we could solve just (2.21) for the 's and , for example, we could compute the other filters.

V. More properties of analysis and synthesis filters: zeros at , vanishing moments

One more point deserves mention in the construction of the analysis and resynthesis filters . Qualitatively, the wavelet coefficients represent how different one approximation of a decomposed function is to the next detailed approximation of the function. Therefore, if a function is "simple" enough, and the wavelet analysis is "good," then the wavelet representation of this function should have few, relatively small-valued wavelet coefficients.

Specifically, if a function is "simple" enough, and the analysis is "good" enough, then the detail afforded by deeper levels of wavelet analysis won't matter, since after a certain level the differences represented by higher levels of wavelet coefficients, and therefore the values of the wavelet coefficients themselves past a certain level, will be negligible. What parameters do "simple" functions and "good" wavelet analyses refer to, and are there parameters we can control to achieve "good" analyses as often as possible?

Developments in wavelet theory have established that an extra requirement on the low-pass filters can make "good" analyses of certain "simple" functions known as polynomials. Specifically, in addition to the requirements imposed on the filters by the dilation equations (2.19) - (2.22) and the high/low/analysis/synthesis interrelationships in (2.1) - (2.5), low-pass filters that have p zeros at can perfectly analyze and reproduce a polynomial of degree p-1. The more zeros each have at , the "better" the analysis is. In other words, using low pass filters of degree p-1, if any portion of any level scaling coefficients is a polynomial of degree p-1, then the next level's scaling coefficients corresponding to that same portion of the signal will be exactly zero. In general, therefore, wavelet analyses for which the low pass filters have zeros at are best suited to analyzing polynomials.


(Strang 227)


Most often, however, wavelet analysis is applied to natural signals, which are seldom perfect polynomials. Nonetheless, in practice, using a lowpass filter with p zeros at means that the higher-level wavelet coefficients resulting from the analysis will approach zero faster for a higher value of p, for piecewise-smooth polynomials. This is desirable, since we hope that quickly decaying wavelet coefficients will produce a signal representation which has few non-zero wavelet coefficients that die away quickly across levels, thus producing more concentrated and easily separated features in the wavelet representation of the input signal. In practical terms, a lowpass filter with more zeros at can exactly analyze a signal that is piecewise polynomial of a higher degree.

VI. Derivation of the analysis/resynthesis filters used in this thesis

The final task of this chapter is to derive actual filter coefficients for use in this thesis. As stated above, finding specific filters that produce wavelet coefficients from an original signal involves solving one of the dilation equations (2.19), (2.20), (2.21), or (2.22). However, solving the dilation equation involves extensive linear algebra which is beyond the scope of this thesis. Suffice it to say that most solution methods involve recursive techniques due to the recursive nature of the dilation equation itself.

There are many wavelet families whose analyzing and resynthesizing filter coefficients have already been derived, such as the Haar and Daubechies filters. However, as stated above, we require a filter whose coefficients are symmetric with respect to a central axis, so that no phase distortion results from the filtering. Therefore, the wavelets and filters used in this thesis come from a family known as biorthogonal, symmetric, binary coefficient filters. These wavelets and filters were chosen because of their symmetry and simplicity of implementation. Specifically, a binary filter is a filter whose coefficients are integers divided by a power of two. Binary filters are significant not only for their simplicity, but also for their efficiency in computer-oriented integer-only computations, where division by a power of two is often quicker than a floating point division (Strang 216). Tables 2.1 lists coefficients for the high and low pass filters which are used in this thesis, along with the Haar coefficients and the coefficients for the Daubechies "D4" orthonormal filters. In deriving these filters, the filter was computed by solving the dilation equation (2.21), and then (2.1) - (2.5) were used to find coefficients of the other filters. The number of zeros at for each filter are listed in Table 2.2. Also note that the analysis filters can be interchanged with the synthesis filters, since the same dilation equation used to derive can also be used to derive .



Name
h0

(analysis low-pass)
h1

(analysis high pass)
f0

(resynthesis low-pass)
f1

(resynthesis low-pass)
Haar
Dau-bechies D4
binary 3/5
binary 5/3
binary 6/10
binary 10/6


Table 2.1








Name
zeros at

h0

(analysis low-pass)
zeros at 0

h1

(analysis high pass)
zeros at

f0

(resynthesis low-pass)
zeros at 0

f1

(resynthesis low-pass)
Haar
1
1
1
1
Daubechies D4
2
2
2
2
binary 3/5
2
2
2
2
binary 5/3
2
2
2
2
binary 6/10
3
3
3
3
binary 10/6
3
3
3
3

Table 2.2

Go to:
Title Page               Chapter 1           Appendix A   Appendix J
Copyright                Chapter 2           Appendix B   Appendix K
Abstract                 Chapter 3           Appendix C   Appendix L
Acknowledgements         Chapter 4           Appendix D   References
Table of Contents        Chapter 5           Appendix E
List of Figure           Conclusions/        Appendix F
List of Tables           Future Directions   Appendix G
List of Audio Examples                       Appendix H
List of Programs                             Appendix I