Go to: Title Page Chapter 1 Appendix A Appendix J Copyright Chapter 2 Appendix B Appendix K Abstract Chapter 3 Appendix C Appendix L Acknowledgements Chapter 4 Appendix D References Table of Contents Chapter 5 Appendix E List of Figure Conclusions/ Appendix F List of Tables Future Directions Appendix G List of Audio Examples Appendix H List of Programs Appendix I
I. Introduction and Overview
The forward and inverse wavelet transform of a window of samples
can be implemented using a set of upsamplers, downsamplers, and
recursive two-channel digital filter banks. For the forward wavelet
transform, the output of the filter banks are the wavelet coefficients
at a desired level of resolution; for the inverse wavelet transform,
the wavelet coefficients are the input, and the output of the
final set of filter banks is the reconstructed window. Typically,
the wavelet coefficients are analyzed or altered in the intermediate
stage before the inverse transform is performed. Here is a diagram
of the entire analysis/resynthesis process for a three level,
perfect reconstruction wavelet analysis and resynthesis:
First, a window of input samples is selected which is a power of 2. The window of input samples is then processed twice in parallel, once through the high pass filter, and once through the lowpass filter. The outputs of the filters are then downsampled so that each output is exactly one half the size of the original input (every other sample of each output is omitted). The downsampled output from the first highpass filter becomes the wavelet coefficients at level N, where N is the number of levels of wavelet analysis desired (in Figure 2.1, N = 3). These coefficients contain the highest time-resolution, widest bandwidth information, and highest center frequency wavelet coefficients of the analysis. The downsampled output from the first lowpass filter serves as the input to an identical highpass/lowpass, downsampling stage. The next stage's downsampled output from the highpass filter will become the wavelet coefficients for level N-1 (N-1 = 2 in Figure 2.1). The (N-1)th level wavelet coefficients will have lower characteristic time-resolution by a factor of two, narrower characteristic bandwidth information by a factor of two, and lower characteristic center frequency by a factor of two. This process continues for the desired number of levels. The output of the final low-pass filter has the same size as the level 1 wavelet coefficients, and are referred to as the final average coefficients.
The inverse transform works in a similar manner. First, the level 1 wavelet coefficients and the final average coefficients are upsampled so that each is twice its original size (zeros are inserted between every sample of each set of coefficients). The upsampled versions of the final average and the level 1 wavelet coefficients are passed through the inverse low pass and inverse high pass filters, respectively (note that the analysis and synthesis high pass filters are different, as are the analysis and synthesis low pass filters). The outputs are added together, and this sum serves as the input to the low-pass channel for the next stage. This process continues until all of the wavelet coefficients have been upsampled, filtered, and added to the total output of previous stages' processing. The final sum is the reconstructed input.
The upsampling and downsampling depicted in Figure 2.1 is important in ensuring that the size of the output (number of total wavelet coefficients on all levels + number of samples in the final average) is the same size as the input. If the downsampling did not occur in the analysis (forward transform), there would be a "data explosion" problem similar to the one that occurs for FFT windows which severely overlap each other (Moore 106). Since the outputs of the final low pass filter and all the high pass filters are the final average and wavelet coefficients of a data set, respectively, the upsampling and downsampling operations can also be interpreted in the following way: downsampling in the analysis half of the algorithm ensures that there is an equal amount of data in the input signal wavelet domain representation. In other words, if 256 samples are in the input signal, then there will be 128 level 3 wavelet coefficients, 64 level 2 coefficients, 32 level 3 coefficients, and 32 coefficients in the final average, for a total of 256 coefficients in the wavelet representation of the 256 sample input. Another way to view the two-fold decrease in the number of wavelet coefficients for subsequent levels of analysis is that there is a change in the time resolution from one level to another. Since the wavelet coefficients on each level describe the entire window's time-frequency information, more coefficients on one level means a higher time-resolution for that level's wavelet coefficients; consequently, fewer coefficients on a lower level mean a lower time resolution for that level's wavelet coefficients.
It is important to note that the mother wavelet's shape is not being used directly in the computation of the forward and inverse wavelet transforms; the high and low pass filters, which are related to the mother wavelet, are used for all computations. In general, the high and low pass filters, which are different for the analysis and resynthesis stages, usually have cutoff frequencies near one quarter the sample rate (one half the Nyquist frequency). Therefore, while the upsampling and downsampling operations explain the increased time resolution for higher levels of analysis, the equal-bandwidth high and low pass filter structure explains the higher frequency resolution for lower levels of analysis. This can be seen as follows: the input, which contains full-bandwidth frequency information, is separated in the level N analysis into two spectral components: high frequency information (level N wavelet coefficients) and low frequency information (input to next analysis stage). Each output, then, has half the bandwidth of its input, since the cutoff frequencies of the high and low pass filters are at approximately half the spectrum. Since the low-frequency information is split again into high and low frequency information, wavelet coefficients at subsequent levels have better and better frequency resolution (lower bandwidth of information), while having worse and worse time resolution.
Although the digital filters in Figure 2.1 do have other special properties described below, their implementations in the forward and inverse transform are straightforward. Each filter has a set of filter coefficients, which are usually employed in a time-domain convolution with the input window's samples.
A common problem in the implementation of digital filters on
windowed data is how to deal with the boundaries of the window.
For example, when a long soundfile is divided into several 4096-sample
windows, the wavelet coefficients that correspond to the first
and last few samples of each 4096-sample window exhibit artifacts
from the windowing. There are several methods that have been developed
outside of wavelet theory that deal with this problem, including
symmetric extension, zero-padding, and circular convolution (Strang
340). In this thesis, circular convolution is used for most windowing
purposes since it is the most straightforward to implement. The
reader is referred to standard textbooks on signal processing
for details on the circular convolution scheme (Oppenheim 542).
II. Signal processing approach to wavelet analysis
Let us return to the details of the high and low pass filters
associated with wavelet analysis. The high and low pass filters
represented above must satisfy several mathematical constraints
in order to have the defining properties associated with a multiresolution,
or wavelet analysis. Most importantly, the filters must satisfy
a "perfect reconstruction" constraint for the filter
coefficients. Mathematically, no distortion nor aliasing should
occur for a signal that enters and leaves the filter bank with
no processing of wavelet coefficients. In other words, the output
should be the same as the input for no processing of wavelet
coefficients:
(Strang 104)
In the design of such filters, a good simplification to make
that immediately satisfies the no alias requirement is to assign
(Strang 105)
This choice of filters not only automatically satisfies the no-aliasing requirement, but it also gives us "two filters for the price of one"; that is, if we design the H (analysis) filters, (2.3) will quickly give us the F(synthesis) filters.
We can place even more restrictions on the relationship between
the two analysis filters to further narrow our filter design parameters.
(2.4) and (2.5) each impose certain restrictions on the relationship
between H0 and H1 that apply for equal length
filters, where the filter coefficients of H0 are h(0),
h(1), H(2)... H(N).
(Strang 109)
(2.4) is known as the alternating sign construction: if the filter coefficients of H0 are h(0), h(1), H(2)... H(N), then the filter coefficients of H1 are h(0), -h(1), h(2), -h(3)... The pair of filters that results from such a construction are referred to as QMF (Quadrature Mirror Filters) filters. As Strang says in his book, "The highpass response |H1(ej)| is a mirror image of the lowpass magnitude |H0(ej)| with respect to the middle frequency /2 - the quadrature frequency (Strang 109)."
Another choice that relates H0(z) to H1(z)
is called the alternating flip construction. If the filter coefficients
of H0 are h(0), h(1), H(2)... H(N), then the filter
coefficients of H1 are h(N), -h(N-1), h(N-2), -h(N-3)....
Mathematically, this can be expressed in the z-domain relation
(Strang, 109)
This construction produces a pair of orthogonal filters. The celebrated Daubechies filters, for example, are orthogonal filters (Daubechies). Not all filterbanks are orthogonal; however, all perfect reconstructing filterbanks are biorthogonal.
The distinction between orthogonal and biorthogonal filter banks is significant, and involves parallel concepts in linear space theory and functional analysis. Orthogonal filter banks contain filters that are mathematically "neater" than filters in biorthogonal filter banks; in wavelet analysis, an orthogonal filter bank corresponds to a set of basis functions which are all orthogonal to one another, a desirable property when constructing basis functions for any space. However, filters in an orthogonal filter bank cannot have coefficients which are symmetric (or anti-symmetric) about a central axis. In audio processing, a filter which is not symmetric (or anti-symmetric) about a central axis produces phase distortion in its output because it is not a linear phase filter (Oppenheim 254). Although phase distortion is a relatively mild form of distortion for audio, it is nonetheless desirable to eliminate as many kinds of distortion possible in the final filtering of a signal. Biorthogonal filter banks, although "messier" mathematically, allow for symmetric filter coefficients, and thus produce linear-phase, non-phase-distorting filters. Therefore, biorthogonal, symmetric filters are used exclusively in this thesis.
To summarize, we have developed a scheme for executing the forward
and inverse wavelet transform using a recursive set of filters.
Four filters are involved, a high pass filter and low pass filter
for both the analysis and resynthesis. By relating the analysis
filters with the synthesis filters through (2.1), (2.2), and
(2.3), and by relating the analyzing high pass and low pass filters
through (2.4) and (2.5), we are left with the design of only one
filter. In order to design this filter, we leave the language
of signal processing and now turn to another branch of mathematics:
functional analysis.
III. Multiresolution and functional analysis approach to wavelet analysis
Multiresolution analysis is functional analysis' theoretic
construct which parallels the above signal-processing-style filter
bank interpretation of wavelets. Multiresolution analysis is the
separation of the Hilbert space L2
into subspaces, where the subspaces and basis functions
of these subspaces have special relationships to each other. Consequently,
in a multiresolution analysis, functions belonging to each subspace
must be composed of that subspace's basis functions, and
therefore must have certain properties related to the basis functions
of the subspace they span. Specifically, a biorthogonal
multiresolution analysis is a division of
L2 into subspaces
and
such that the following requirements are
met:

(2.6) says that
is a set of subspaces
whose union over j is the complete space L2.
Furthermore, (2.6) says that if a function is a member of
,
then it is also a member of the "larger" subspace
.
(2.7) establishes a "dual" set of subspaces
for which the same condition is true. In this sense, both the
's and the
's
each span the complete space L2.
(2.8) states that if a function is in
,
then a two-scale "compressed" version of that function
is in
. (2.10) states that if a function
is in
, then all of its time axis translates
are also members of
. (2.9) and (2.11)
establish a "dual" set of subspaces
for which the same conditions are true.
Given the discussion above, recursion can show the following
important central fact of multiresolution analysis: if a certain
function is a member of
, not only are
the function and all of its time-axis translates also members
of
and
, but the
function's two-fold "time-compressed version"
and all if its time-axis translates are members of
.
Therefore, an important corollary is that the basis functions
of the
's must follow this rule.
That is, if a certain basis function is a member of
,
not only are the basis function and all of its time-axis translates
also members of
and
,
but the basis function's two-fold "time-compressed
version" and all if its time-axis translates are members
of
. Because of (2.11), the same conclusions
can be said of the dual set of subspaces
.
(2.12) makes the existence of the basis functions formal for both
the primary (
) and dual (
)
subspaces.
Furthermore, (2.12) requires the two sets of bases to have a
certain relationship to one another: biorthogonality. The concept
of biorthogonal bases is a well-developed concept in the areas
of linear algebra and functional analysis. The following theorem
is important in understanding why a pair of subspaces with
biorthogonal basis functions is required for multiresolution analysis:
(Strang 71)
The need for two different expressions of the same function in
two different sets of subspaces (
and
)
must be linked by the biorthogonality constraint of the corresponding
basis functions for perfect reconstruction purposes. As will be
shown later, one set of basis functions is linked with the analysis
of a function, while another set of basis functions is linked
with the resynthesis of the function. The biorthogonality constraint
guarantees that the entire analysis and resynthesis will produce
a perfect reconstruction. The biorthogonality constraint on the
basis functions will eventually lead to the development of biorthogonal
filter banks as discussed in II. above. The reader will recall
that biorthogonality of the filters is required for symmetric
filters which are linear in phase response and therefore produce
no phase distortion for audio input signals.
Once these two sets of subspaces have been established, wavelet
function subspaces enter the discussion as the difference subspace
between different
(and
).
Since
contains the translates of two-fold
time-axis compressed basis functions of
,
and since any function that is in
is also
in
, it is reasonable to say that
contains more "detailed" basis functions than
;
that is, the projection of an arbitrary L2
function onto
will contain more
"details" than the projection of that same function
onto
because of the added detail possible
in
. Since any function that is in
is also in
, there must be a set, or subspace
of functions which are contained in
but
are not contained in
. Let this subspace
be
, and let the basis functions that span
this subspace be called the wavelets basis functions
.
Since the wavelet basis functions
, which
span
are also functions in
,
the
inherit many important properties
from the basis functions in
. Significantly,
the properties of scale and shift invariance stated in (2.8) -
(2.11) also apply to the wavelet basis functions
.
That is, if a certain wavelet basis function is a member of
,
not only are the wavelet basis function and all of its time-axis
translates also members of
and
,
but the wavelet basis function's two-fold "time-compressed
version" and all if its time-axis translates are members
of
.
In addition, the following properties relate the
's
and
's, show how the
's
can be used to compute the
's, and
how
and the
's
can be used to compute the
's (for
j_0).

For example, consider an arbitrary function f that lies
in L2. . Since
the closure of the union of the
's
is complete in L2.,
and since the closure of the union of the
's
is complete in L2.,
we can construct successive approximations to f which
are increasingly detailed; these approximations will be contained
in the increasingly more detailed subspaces
and
. Now we can see the use of (2.13)
and (2.14). In order to do this decomposition of f into
linear combinations of the basis functions
(which span
), we need to use the biorthogonal
relationship between
and
.
That is, in order to compute the contribution of a certain set
of
's to the approximation to f
in subspace
, we need to take the dot product
of f with the corresponding set of
's
which correspond to the dual subspace
.
Once we have found this set of successive approximations to f,
we can use (2.15) to find how two successive approximations in
and
differ from
each other. This difference function is contained in the subspace
, and is spanned by a certain set of wavelet
functions.
IV. Relationship of processing theory to multiresolution and functional analysis; Derivation of the forward and inverse fast wavelet transforms
All that remains is to show how this theory relates back to the
filterbank scheme depicted in Figure 2.1. In fact, the filter
bank scheme in Figure 2.1 exactly performs this analysis and resynthesis
routine in actual implementations of the wavelet transform. The
decomposition process just discussed is exactly analogous to the
analysis stage of the filter bank scheme depicted in Figure 2.1.
The projection of f onto
is analogous
to the "final average" in Figure 2.1, the projection
of f onto
is analogous to the level
3 wavelet coefficients, the projection of f onto
is analogous to the level 2 wavelet coefficients, and the projection
of f onto
is analogous to the level 1
wavelet coefficients (Please note the inverse relationship between
the indices of the
's and the wavelet
levels).
To find the connection between filterbanks and multiresolution
analysis, let us ask the following question: What exactly are
the basis functions which span the subspaces
and
, and how do we compute them numerically?
To begin, we can summarize (2.6)-(2.12) in a few, concise quantitative
statements that relate the basis functions
Scale invariance, shift invariance, and the nested property of
the subspaces such that
generate the following
set of relations:
(2.19) - (2.22) are known as dilation equations, two-scale difference
equations, or refinement equations. All four equations simply
express a change of basis from one subspace to another. In other
words, these relations relate the basis functions of the
subspaces (and the
subspaces) by expressing
one set of basis functions in terms of a linear combination of
another set of basis functions. The constants h0,
h1, f0, and f1
are the coefficients that describe the exact linear transformation
that is required to achieve this change of basis.
In fact, these changes of bases, which involve factors of 2 along
the time axis, can be used to write the central recursive relationship
between the decomposition of an arbitrary L2.
function into the subspaces
.
By (2.12),
has basis functions
.
Now, let

We find
by taking the L2
inner (dot) product with the dual (biorthogonal) basis functions
, as described in (2.13):

Now, recall the change of basis property of the basis functions,
(2.21), to write



However, the
term in the last integral
of (2.27) is just an expression for the basis functions of
.
Therefore, when the integral, or inner product in (2.27) is evaluated,
the
and
terms
will combine to reduce to a term of
inside
the summation sign. The important fact from this reasoning is
that one can derive the
's from
the
's recursively:

By similar reasoning, the same relationship is true for
's;
that is, they can be derived recursively from the scaling coefficients
one level above:

Note the presence of the "change of basis" coefficients
in (2.28) and (2.29). Now, recall that (2.19) - (2.22) relate
any two sets of primary/dual scaling/wavelet basis functions on
adjacent levels. Therefore, (2.27) can be written in such a way
that it is a general relationship between two adjacent
levels, not only just levels zero and one. In other words, in
general, the wavelet and scaling coefficients
can be computed directly from certain "change of basis
coefficients"
and the scaling coefficients
from one level above. This is the key recursion in the computation
of the wavelet transform. Together, they are called the fast
wavelet transform:

The summations in (2.30) are familiar; they are downsampled convolutions,
with filter coefficients
. Therefore,
analysis of (2.30) provides the link to the filter bank theory
that was introduced at the beginning of this chapter. Not only
are
"change of basis coefficients"
as introduced in (2.19) - (2.22), but they are also digital filter
coefficients. The downsampled output from filters
will produce the scaling coefficients
and wavelet coefficients
, respectively,
if both the filters are presented with the scaling coefficients
as input. (2.30) can be interpreted as
a recursive relationship, and consequently any level of wavelet
and/or scaling coefficients can be computed if the scaling coefficients
are given, and are taken as the starting
points for the recursion. Then the
become
the "level N-n"; "level N-(n-1)"; "level
N-(n-2)"... wavelet coefficients in Figure 2.1, and the
final computation of the
becomes the
"final average" in Figure 2.1. Typically, the signal
to be analyzed with wavelet analysis is windowed, and each window
of raw input becomes the inputs
to the
recursive filter bank scheme. Mathematically, this arbitrary assignment
of sample values to
is not theoretically
supported; in fact, Strang even calls it a "wavelet crime
(Strang 232)." Nonetheless, it is convenient to take the
input samples as the starting point in the recursive scheme, and
it is possibly the most common starting point for wavelet analysis.
In this thesis, therefore, the input samples are indeed windowed,
and are assigned to be the scaling coefficients
.
The inverse wavelet transform is similar to the forward transform just discussed. Similar reasoning can show that a relationship resembling (2.30) exists that parallels the second half of the filter bank scheme in Figure 2.1. The only thing left to explain is why the outputs of the filters in the second half of Figure 2.1 add before entering the next stage of the reconstruction.
(2.15) and (2.16) show us that there are two expressions for
any function in the subspace
: we can
expand in terms of the basis functions in
,
or expand in terms of the basis functions of
.
Together with (2.19) - (2.22), we can make the following relationship:

The addition sign in (2.31) is significant: if a similar analysis
is performed on (2.31) as is performed on (2.19) - (2.22) through
(2.23) - (2.30), then the addition sign can be traced to the following
result. This result is analogous to (2.30), and is called the
inverse fast wavelet transform:

Now we can show that the
are the four
filters in Figure 2.1. Recall the restrictions we placed on the
filters in (2.1) - (2.5). By imposing two sets of restrictions
on interrelationships of these filters (one which relates the
analysis and synthesis high and low pass filters to each other,
one which relates the high and low pass analysis filters to each
other), we were left with the design of only one filter from which
the other three could be easily computed in terms of filter coefficient
order reversal and/or filter coefficient sign negation. Recall
also the dilation equations (2.19) - (2.22). If we could solve
just (2.21) for the
's and
,
for example, we could compute the other filters.
V. More properties of analysis and synthesis filters: zeros at , vanishing moments
One more point deserves mention in the construction of the analysis
and resynthesis filters
. Qualitatively,
the wavelet coefficients
represent how
different one approximation of a decomposed function is to the
next detailed approximation of the function. Therefore, if a function
is "simple" enough, and the wavelet analysis is "good,"
then the wavelet representation of this function should have few,
relatively small-valued wavelet coefficients.
Specifically, if a function is "simple" enough, and the analysis is "good" enough, then the detail afforded by deeper levels of wavelet analysis won't matter, since after a certain level the differences represented by higher levels of wavelet coefficients, and therefore the values of the wavelet coefficients themselves past a certain level, will be negligible. What parameters do "simple" functions and "good" wavelet analyses refer to, and are there parameters we can control to achieve "good" analyses as often as possible?
Developments in wavelet theory have established that an extra
requirement on the low-pass filters
can
make "good" analyses of certain "simple"
functions known as polynomials. Specifically, in addition to the
requirements imposed on the filters by the dilation equations
(2.19) - (2.22) and the high/low/analysis/synthesis interrelationships
in (2.1) - (2.5), low-pass filters that have p zeros at
can perfectly analyze and reproduce a polynomial of degree p-1.
The more zeros
each have at , the "better"
the analysis is. In other words, using low pass filters of degree
p-1, if any portion of any level scaling coefficients
is a polynomial of degree p-1, then the next level's scaling
coefficients
corresponding to that same
portion of the signal will be exactly zero. In general, therefore,
wavelet analyses for which the low pass filters have zeros at
are best suited to analyzing polynomials.
(Strang 227)
Most often, however, wavelet analysis is applied to natural signals,
which are seldom perfect polynomials. Nonetheless, in practice,
using a lowpass filter with p zeros at means that the
higher-level wavelet coefficients resulting from the analysis
will approach zero faster for a higher value of p, for
piecewise-smooth polynomials. This is desirable, since we hope
that quickly decaying wavelet coefficients will produce a signal
representation which has few non-zero wavelet coefficients that
die away quickly across levels, thus producing more concentrated
and easily separated features in the wavelet representation of
the input signal. In practical terms, a lowpass filter with more
zeros at can exactly analyze a signal that is piecewise polynomial
of a higher degree.
VI. Derivation of the analysis/resynthesis filters used in this thesis
The final task of this chapter is to derive actual filter coefficients for use in this thesis. As stated above, finding specific filters that produce wavelet coefficients from an original signal involves solving one of the dilation equations (2.19), (2.20), (2.21), or (2.22). However, solving the dilation equation involves extensive linear algebra which is beyond the scope of this thesis. Suffice it to say that most solution methods involve recursive techniques due to the recursive nature of the dilation equation itself.
There are many wavelet families whose analyzing and resynthesizing
filter coefficients have already been derived, such as the Haar
and Daubechies filters. However, as stated above, we require a
filter whose coefficients are symmetric with respect to a central
axis, so that no phase distortion results from the filtering.
Therefore, the wavelets and filters used in this thesis come from
a family known as biorthogonal, symmetric, binary coefficient
filters. These wavelets and filters were chosen because of their
symmetry and simplicity of implementation. Specifically, a binary
filter is a filter whose coefficients are integers divided by
a power of two. Binary filters are significant not only for their
simplicity, but also for their efficiency in computer-oriented
integer-only computations, where division by a power of two is
often quicker than a floating point division (Strang 216). Tables
2.1 lists coefficients for the high and low pass filters which
are used in this thesis, along with the Haar coefficients and
the coefficients for the Daubechies "D4" orthonormal
filters. In deriving these filters, the
filter was computed by solving the dilation equation (2.21), and
then (2.1) - (2.5) were used to find coefficients of the other
filters. The number of zeros at for each filter are listed in
Table 2.2. Also note that the analysis filters can be interchanged
with the synthesis filters, since the same dilation equation used
to derive
can also be used to derive
.
|
|
|
| |
![]() | ![]() |
![]() | ![]() | |
![]() | ![]() |
![]() | ![]() | |
![]() | ![]() |
![]() | ![]() | |
![]() | ![]() |
![]() | ![]() | |
![]() | ![]() |
![]() | ![]() | |
![]() | ![]() |
![]() | ![]() |
|
|
|
| |
Go to: Title Page Chapter 1 Appendix A Appendix J Copyright Chapter 2 Appendix B Appendix K Abstract Chapter 3 Appendix C Appendix L Acknowledgements Chapter 4 Appendix D References Table of Contents Chapter 5 Appendix E List of Figure Conclusions/ Appendix F List of Tables Future Directions Appendix G List of Audio Examples Appendix H List of Programs Appendix I