

# 7.3 Wavelets and Other Multiscale Transforms

Given their recent popularity and clear evidence of wide applicability, most of the space in this chapter is devoted to wavelet transforms. Over the last decade, statistical multiscale modeling has become a well-established area in both theoretical and applied statistics, with an impact on developments in statistical methodology.

Wavelet-based methods are important in statistics in areas such as regression, density and function estimation, factor analysis, modeling and forecasting in time series analysis, assessing self-similarity and fractality in data, and spatial statistics.

The attention of the statistical community was attracted in the late 1980s and early 1990s, when Donoho, Johnstone, and their coauthors demonstrated that wavelet thresholding, a simple denoising procedure, had desirable statistical optimality properties. Since then, wavelets have proved useful in many statistical disciplines, notably in nonparametric statistics and time series analysis. More recently, Bayesian concepts and modeling approaches have been identified as providing promising contexts for wavelet-based denoising applications.

In addition to replacing traditional orthonormal bases in a variety of statistical problems, wavelets brought novel techniques and invigorated some of the existing ones.

## 7.3.1 A Case Study

We start with a statistical application of wavelet transforms. The example highlights a feature of wavelet-based denoising that standard state-of-the-art denoising techniques do not share.

A researcher in geology was interested in predicting earthquakes by the level of water in nearby wells. She had a large ( measurements) data set of water levels taken every hour in a period of time of about one year in a California well. Here is the description of the problem.

The ability of water wells to act as strain meters has been observed for centuries. The Chinese, for example, have records of water flowing from wells prior to earthquakes. Lab studies indicate that a seismic slip occurs along a fault prior to rupture. Recent work has attempted to quantify this response, in an effort to use water wells as sensitive indicators of volumetric strain. If this is possible, water wells could aid in earthquake prediction by sensing precursory earthquake strain.

We have water level records from six wells in southern California, collected over a six year time span. At least  moderate size earthquakes (magnitude -) occurred in close proximity to the wells during this time interval. There is a significant amount of noise in the water level record which must first be filtered out. Environmental factors such as earth tides and atmospheric pressure create noise with frequencies ranging from seasonal to semidiurnal. The amount of rainfall also affects the water level, as do surface loading, pumping, recharge (such as an increase in water level due to irrigation), and sonic booms, to name a few. Once the noise is subtracted from the signal, the record can be analyzed for changes in water level, either an increase or a decrease depending upon whether the aquifer is experiencing a tensile or compressional volume strain, just prior to an earthquake.

A plot of the raw data for hourly measurements over one year ( observations) is given in Fig. 7.6a, with a close-up in Fig. 7.6b. After applying the wavelet transform and further processing the wavelet coefficients (thresholding), we obtained a fairly clean signal with a big jump at the earthquake time. The wavelet-denoised data are given in Fig. 7.6d. The magnitude of the water level change at the earthquake time did not get distorted in contrast to traditional smoothing techniques. This local adaptivity is a desirable feature of wavelet methods.

For comparison, Fig. 7.6c shows the signal denoised by the supsmo smoothing procedure. Note that the earthquake jump is smoothed out as well.

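Figure 7.6: (a) raw water-level data; (b) close-up of the raw data; (c) data smoothed by supsmo; (d) wavelet-denoised data.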
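The local adaptivity described in this case study can be illustrated with a minimal sketch. It does not use the well data: the signal is a synthetic step with Gaussian noise, the transform is the orthonormal Haar transform, the threshold is the universal threshold $\sigma\sqrt{2\log n}$ with $\sigma$ assumed known, and a plain moving average stands in for a traditional smoother (it is not supsmo). All function names are illustrative.

```python
import math
import random

def haar_forward(x):
    """Orthonormal Haar decomposition of a list whose length is a power of 2."""
    approx, details = list(x), []
    while len(approx) > 1:
        pairs = range(0, len(approx), 2)
        s = [(approx[i] + approx[i + 1]) / math.sqrt(2) for i in pairs]
        d = [(approx[i] - approx[i + 1]) / math.sqrt(2) for i in pairs]
        details.append(d)          # finest level is stored first
        approx = s
    return approx, details

def haar_inverse(approx, details):
    """Inverse of haar_forward."""
    x = list(approx)
    for d in reversed(details):    # start from the coarsest level
        nxt = []
        for s_i, d_i in zip(x, d):
            nxt += [(s_i + d_i) / math.sqrt(2), (s_i - d_i) / math.sqrt(2)]
        x = nxt
    return x

def hard_threshold(details, lam):
    """Keep a coefficient only if its magnitude exceeds lam."""
    return [[c if abs(c) > lam else 0.0 for c in d] for d in details]

def moving_average(x, half_width):
    """Simple linear smoother used here as a stand-in for a traditional method."""
    n = len(x)
    out = []
    for i in range(n):
        lo, hi = max(0, i - half_width), min(n, i + half_width + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out

random.seed(0)
n, jump_at, sigma = 1024, 512, 0.2          # synthetic setup (assumption)
signal = [0.0 if i < jump_at else 2.0 for i in range(n)]
noisy = [s + random.gauss(0.0, sigma) for s in signal]

approx, details = haar_forward(noisy)
lam = sigma * math.sqrt(2.0 * math.log(n))  # universal threshold, sigma known
wavelet_fit = haar_inverse(approx, hard_threshold(details, lam))
smooth_fit = moving_average(noisy, 20)

def jump_size(y):
    """Estimated change across the discontinuity."""
    return y[jump_at] - y[jump_at - 1]

print("jump after wavelet thresholding:", jump_size(wavelet_fit))
print("jump after moving average:      ", jump_size(smooth_fit))
```

In this setup the thresholded reconstruction keeps most of the jump between two neighboring samples, while the moving average spreads it over the whole window.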

## 7.3.2 Continuous Wavelet Transform

The first theoretical results in wavelets had been concerned with continuous wavelet decompositions of functions and go back to the early 1980s. Papers of Morlet et al. (1982)[19] and Grossmann and Morlet (1984, 1985)[13,14] were among the first on this subject.

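Let $\psi_{a,b}(x)$, $a \in \mathbb{R}\setminus\{0\}$, $b \in \mathbb{R}$, be a family of functions defined as translations and re-scalings of a single function $\psi(x) \in L_2(\mathbb{R})$,

$$\psi_{a,b}(x) = \frac{1}{\sqrt{|a|}}\, \psi\!\left(\frac{x-b}{a}\right). \tag{7.16}$$

The normalization constant $1/\sqrt{|a|}$ ensures that the norm $\|\psi_{a,b}(x)\|$ is independent of $a$ and $b$. The function $\psi$ (called the wavelet function) is assumed to satisfy the admissibility condition,

$$C_\psi = \int_{\mathbb{R}} \frac{|\Psi(\omega)|^2}{|\omega|}\, d\omega < \infty, \tag{7.17}$$

where $\Psi(\omega) = \int_{\mathbb{R}} \psi(x)\, e^{-ix\omega}\, dx$ is the Fourier transform of $\psi(x)$. The admissibility condition (7.17) implies

$$0 = \Psi(0) = \int \psi(x)\, dx.$$

Also, if $\int \psi(x)\, dx = 0$ and $\int (1+|x|^\alpha)\,|\psi(x)|\, dx < \infty$ for some $\alpha > 0$, then $C_\psi < \infty$.

Wavelet functions are usually normalized to ''have unit energy'', i.e., $\|\psi_{a,b}(x)\| = 1$.

For example, the second derivative of the Gaussian function,

$$\psi(x) = \frac{d^2}{dx^2}\left[-C e^{-x^2/2}\right] = C\,(1-x^2)\, e^{-x^2/2}, \qquad C = \frac{2}{\sqrt{3\sqrt{\pi}}},$$

is an example of an admissible wavelet, called the Mexican Hat or Marr's wavelet; see Fig. 7.7.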
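The two normalizations mentioned above (zero integral and unit energy) can be checked numerically for the Mexican Hat wavelet. The sketch below assumes the standard form $\psi(x) = C(1-x^2)e^{-x^2/2}$ with $C = 2/\sqrt{3\sqrt{\pi}}$ and uses a plain midpoint rule; the helper names and integration limits are illustrative choices.

```python
import math

C = 2.0 / math.sqrt(3.0 * math.sqrt(math.pi))   # unit-energy constant (assumed form)

def mexican_hat(x):
    """Mexican Hat (Marr) wavelet, proportional to the second derivative of a Gaussian."""
    return C * (1.0 - x * x) * math.exp(-x * x / 2.0)

def psi_ab(x, a, b):
    """Dilated and translated copy as in (7.16)."""
    return mexican_hat((x - b) / a) / math.sqrt(abs(a))

def integrate(f, lo, hi, n=20000):
    """Composite midpoint rule on [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

mean = integrate(mexican_hat, -12.0, 12.0)                           # should vanish
energy = integrate(lambda x: mexican_hat(x) ** 2, -12.0, 12.0)       # squared L2 norm
energy_ab = integrate(lambda x: psi_ab(x, 3.0, 1.5) ** 2, -40.0, 40.0)

print("integral of psi:", mean)
print("energy of psi:", energy)
print("energy of psi_{3,1.5}:", energy_ab)
```

The last quantity illustrates why the factor $1/\sqrt{|a|}$ is used: the energy does not change with the dilation and translation parameters.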

For any square integrable function $f(x)$, the continuous wavelet transform is defined as a function of two variables

$$\mathcal{CWT}_f(a,b) = \langle f, \psi_{a,b} \rangle = \int f(x)\, \psi_{a,b}(x)\, dx.$$

Here the dilation and translation parameters, $a$ and $b$, respectively, vary continuously over $\mathbb{R}\setminus\{0\} \times \mathbb{R}$.

Figure 7.8 gives the doppler test function and its continuous wavelet transform. The wavelet used was the Mexican Hat. Notice the distribution of ''energy'' in the time/frequency plane in Fig. 7.8b.

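Figure 7.8: (a) the doppler test function; (b) its continuous wavelet transform, computed with the Mexican Hat wavelet.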
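A continuous wavelet transform can be approximated at a single point $(a, b)$ by numerical integration of $f(x)\,\psi_{a,b}(x)$. The following sketch assumes a real Mexican Hat wavelet and a midpoint rule; the test signal is itself a dilated and shifted wavelet (a hypothetical choice, not the doppler function), so the transform should peak at the matching scale and location.

```python
import math

C = 2.0 / math.sqrt(3.0 * math.sqrt(math.pi))

def psi(x):
    """Mexican Hat wavelet with unit energy (assumed form)."""
    return C * (1.0 - x * x) * math.exp(-x * x / 2.0)

def psi_ab(x, a, b):
    """Dilated and translated wavelet."""
    return psi((x - b) / a) / math.sqrt(abs(a))

def cwt(f, a, b, lo=-60.0, hi=60.0, n=24000):
    """Midpoint-rule approximation of the inner product <f, psi_ab>."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        total += f(x) * psi_ab(x, a, b)
    return h * total

def signal(x):
    """Test signal: a wavelet at scale 2 centered at 3 (illustrative)."""
    return psi_ab(x, 2.0, 3.0)

peak = cwt(signal, 2.0, 3.0)        # matching scale and location
off_scale = cwt(signal, 0.5, 3.0)   # wrong scale
off_shift = cwt(signal, 2.0, 8.0)   # wrong location
flat = cwt(lambda x: 1.0, 2.0, 3.0) # constants are annihilated (zero mean)

print(peak, off_scale, off_shift, flat)
```

Evaluating `cwt` over a grid of `(a, b)` values would give a discretized picture of the transform, at a cost that reflects its redundancy.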

#### Resolution of Identity.

When the admissibility condition is satisfied, i.e., $C_\psi < \infty$, it is possible to find the inverse continuous transform via the relation known as resolution of identity or Calderón's reproducing identity,

$$f(x) = \frac{1}{C_\psi} \int_{\mathbb{R}^2} \mathcal{CWT}_f(a,b)\, \psi_{a,b}(x)\, \frac{da\, db}{a^2}.$$

The continuous wavelet transform of a function of one variable is a function of two variables. Clearly, the transform is redundant. To ''minimize'' the transform one can select discrete values of $a$ and $b$ and still have a lossless transform. This is achieved by so-called critical sampling.

The critical sampling defined by

$$a = 2^{-j}, \quad b = k\,2^{-j}, \quad j, k \in \mathbb{Z}, \tag{7.18}$$

produces a minimal but complete basis. Any coarser sampling will not produce a unique inverse transform. Moreover, under mild conditions on the wavelet function $\psi$, such sampling produces an orthogonal basis $\{\psi_{jk}(x) = 2^{j/2}\psi(2^j x - k),\ j,k \in \mathbb{Z}\}$. To formally describe properties of minimal and orthogonal wavelet bases, a multiresolution formalism is needed.

## 7.3.3 Multiresolution Analysis

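Fundamental for the construction of critically sampled orthogonal wavelets is the notion of multiresolution analysis introduced by Mallat (1989a, 1989b)[16,17]. A multiresolution analysis (MRA) is a sequence of closed subspaces $V_n$, $n \in \mathbb{Z}$, in $L_2(\mathbb{R})$ such that they lie in a containment hierarchy

$$\cdots \subset V_{-2} \subset V_{-1} \subset V_0 \subset V_1 \subset V_2 \subset \cdots. \tag{7.19}$$

The nested spaces have an intersection that contains only the zero function and a union whose closure contains all square integrable functions,

$$\bigcap_n V_n = \{0\}, \qquad \overline{\bigcup_n V_n} = L_2(\mathbb{R}).$$

(With $\overline{A}$ we denote the closure of a set $A$.) The hierarchy (7.19) is constructed such that the $V$-spaces are self-similar,

$$f(2^j x) \in V_j \iff f(x) \in V_0, \tag{7.20}$$

with the requirement that there exists a scaling function $\phi \in V_0$ whose integer translates span the space $V_0$,

$$V_0 = \Big\{ f \in L_2(\mathbb{R}) \ \Big|\ f(x) = \sum_k c_k\, \phi(x-k) \Big\},$$

and for which the family $\{\phi(\bullet - k),\ k \in \mathbb{Z}\}$ is an orthonormal basis. It can be assumed that $\int \phi(x)\, dx \ge 0$. With this assumption this integral is in fact equal to $1$. Because of the containment $V_0 \subset V_1$, the function $\phi(x) \in V_0$ can be represented as a linear combination of functions from $V_1$, i.e.,

$$\phi(x) = \sum_{k \in \mathbb{Z}} h_k \sqrt{2}\, \phi(2x-k), \tag{7.21}$$

for some coefficients $h_k$, $k \in \mathbb{Z}$. This equation, called the scaling equation (or two-scale equation), is fundamental in constructing, exploring, and utilizing wavelets.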

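Theorem 2   For the scaling function it holds

$$\int_{\mathbb{R}} \phi(x)\, dx = 1,$$

or, equivalently,

$$\Phi(0) = 1,$$

where $\Phi(\omega)$ is the Fourier transform of $\phi$, $\Phi(\omega) = \int_{\mathbb{R}} \phi(x)\, e^{-i\omega x}\, dx$.

The coefficients $h_n$ in (7.21) are important in the efficient application of wavelet transforms. The (possibly infinite) vector $h = \{h_n,\ n \in \mathbb{Z}\}$ will be called a wavelet filter. It is a low-pass (averaging) filter, as will become clear later from its analysis in the Fourier domain.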

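To further explore properties of multiresolution analysis subspaces and their bases, it is convenient to work in the Fourier domain. Define the function $m_0$ as follows:

$$m_0(\omega) = \frac{1}{\sqrt{2}} \sum_{k\in\mathbb{Z}} h_k\, e^{-ik\omega} = \frac{1}{\sqrt{2}}\, H(\omega). \tag{7.22}$$

The function in (7.22) is sometimes called the transfer function and it describes the behavior of the associated filter $h$ in the Fourier domain. Notice that the function $m_0$ is $2\pi$-periodic and that the filter taps $h_n$, $n\in\mathbb{Z}$, are in fact the Fourier coefficients in the Fourier series of $H(\omega) = \sqrt{2}\, m_0(\omega)$.

In the Fourier domain the relation (7.21) becomes

$$\Phi(\omega) = m_0\!\left(\frac{\omega}{2}\right) \Phi\!\left(\frac{\omega}{2}\right), \tag{7.23}$$

where $\Phi(\omega)$ is the Fourier transform of $\phi(x)$. Indeed,

$$\begin{aligned}
\Phi(\omega) &= \int_{-\infty}^{\infty} \phi(x)\, e^{-i\omega x}\, dx \\
&= \sum_k \sqrt{2}\, h_k \int_{-\infty}^{\infty} \phi(2x-k)\, e^{-i\omega x}\, dx \\
&= \sum_k \frac{h_k}{\sqrt{2}}\, e^{-ik\omega/2} \int_{-\infty}^{\infty} \phi(2x-k)\, e^{-i(2x-k)\omega/2}\, d(2x-k) \\
&= \sum_k \frac{h_k}{\sqrt{2}}\, e^{-ik\omega/2}\, \Phi\!\left(\frac{\omega}{2}\right) = m_0\!\left(\frac{\omega}{2}\right) \Phi\!\left(\frac{\omega}{2}\right).
\end{aligned}$$

By iterating (7.23), one gets

$$\Phi(\omega) = \prod_{n=1}^{\infty} m_0\!\left(\frac{\omega}{2^n}\right), \tag{7.24}$$

which is convergent under very mild conditions concerning the rates of decay of the scaling function $\phi$.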
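The infinite product (7.24) can be checked numerically in a case where the Fourier transform of the scaling function is known in closed form. The sketch below assumes the Haar filter $h_0 = h_1 = 1/\sqrt{2}$, whose scaling function is the indicator of $[0,1)$ with Fourier transform $(1-e^{-i\omega})/(i\omega)$, and truncates the product after 40 factors; function names are illustrative.

```python
import cmath
import math

SQRT2 = math.sqrt(2.0)
h_haar = [1.0 / SQRT2, 1.0 / SQRT2]   # Haar low-pass taps h_0, h_1 (assumption)

def m0(omega, h):
    """Transfer function (7.22) of a finite filter h = (h_0, ..., h_{L-1})."""
    return sum(hk * cmath.exp(-1j * k * omega) for k, hk in enumerate(h)) / SQRT2

def phi_hat(omega, h, n_terms=40):
    """Truncated version of the infinite product (7.24)."""
    value = 1.0 + 0.0j
    for n in range(1, n_terms + 1):
        value *= m0(omega / 2.0 ** n, h)
    return value

def phi_hat_haar_exact(omega):
    """Fourier transform of the indicator function of [0, 1)."""
    if omega == 0.0:
        return 1.0 + 0.0j
    return (1.0 - cmath.exp(-1j * omega)) / (1j * omega)

omegas = [0.0, 0.5, 1.0, 3.0, 2.0 * math.pi, 10.0]
errors = [abs(phi_hat(w, h_haar) - phi_hat_haar_exact(w)) for w in omegas]
print("max |product - closed form|:", max(errors))
```

The same `phi_hat` function accepts any finite filter, which is one way to inspect scaling functions that have no closed form.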

Next, we prove two important properties of wavelet filters associated with an orthogonal multiresolution analysis, normalization and orthogonality.

#### Normalization.

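$$\sum_{k\in\mathbb{Z}} h_k = \sqrt{2}. \tag{7.25}$$

Proof:

$$\begin{aligned}
\int \phi(x)\, dx &= \sqrt{2} \sum_k h_k \int \phi(2x-k)\, dx \\
&= \sqrt{2} \sum_k h_k\, \frac{1}{2} \int \phi(2x-k)\, d(2x-k) \\
&= \frac{\sqrt{2}}{2} \sum_k h_k \int \phi(x)\, dx.
\end{aligned}$$

Since $\int \phi(x)\, dx \neq 0$ by assumption, (7.25) follows.

This result also follows from $m_0(0) = 1$.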

#### Orthogonality.

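For any $l \in \mathbb{Z}$,

$$\sum_k h_k\, h_{k-2l} = \delta_l. \tag{7.26}$$

Proof: Notice first that from the scaling equation (7.21) it follows that

$$\begin{aligned}
\phi(x)\,\phi(x-l) &= \sqrt{2} \sum_k h_k\, \phi(2x-k)\, \phi(x-l) \\
&= \sqrt{2} \sum_k h_k\, \phi(2x-k)\ \sqrt{2} \sum_m h_m\, \phi(2(x-l)-m).
\end{aligned} \tag{7.27}$$

By integrating both sides in (7.27) we obtain

$$\begin{aligned}
\delta_l &= 2 \sum_k h_k \left[ \sum_m h_m\, \frac{1}{2} \int \phi(2x-k)\, \phi(2x-2l-m)\, d(2x) \right] \\
&= \sum_k \sum_m h_k\, h_m\, \delta_{k,\,2l+m} \\
&= \sum_k h_k\, h_{k-2l}.
\end{aligned}$$

The last line is obtained by taking $k = 2l+m$.

An important special case is $l=0$, for which (7.26) becomes

$$\sum_k h_k^2 = 1. \tag{7.28}$$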
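The normalization (7.25) and the orthogonality relation (7.26) are easy to check for a concrete finite filter. The sketch below assumes two standard examples, the 2-tap Haar filter and the well-known 4-tap Daubechies filter, and also includes a vector that is deliberately not a wavelet filter; the names are illustrative.

```python
import math

def shifted_products(h, l):
    """sum_k h_k h_{k-2l} for a finite filter indexed 0..len(h)-1 (zero elsewhere)."""
    return sum(h[k] * h[k - 2 * l] for k in range(len(h)) if 0 <= k - 2 * l < len(h))

s2, s3 = math.sqrt(2.0), math.sqrt(3.0)
haar = [1.0 / s2, 1.0 / s2]
four_tap = [(1 + s3) / (4 * s2), (3 + s3) / (4 * s2),
            (3 - s3) / (4 * s2), (1 - s3) / (4 * s2)]
not_a_filter = [0.5, 0.5, 0.5, 0.5]   # unit energy, but fails (7.25) and (7.26)

for name, h in [("haar", haar), ("four_tap", four_tap), ("not_a_filter", not_a_filter)]:
    total = sum(h)                                        # compare with sqrt(2)
    products = [shifted_products(h, l) for l in (-1, 0, 1)]  # compare with (0, 1, 0)
    print(name, total, products)
```

The third vector shows that unit energy alone, i.e. (7.28), is not sufficient: the shifted products with $l \ne 0$ must vanish as well.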

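The fact that the system $\{\phi(\bullet-k),\ k\in\mathbb{Z}\}$ constitutes an orthonormal basis for $V_0$ can be expressed in the Fourier domain in terms of either $\Phi(\omega)$ or $m_0(\omega)$.

In terms of $\Phi(\omega)$,

$$\sum_{l=-\infty}^{\infty} |\Phi(\omega + 2\pi l)|^2 = 1. \tag{7.29}$$

From the Plancherel identity and the $2\pi$-periodicity of $e^{i\omega k}$ it follows

$$\begin{aligned}
\delta_k &= \int_{\mathbb{R}} \phi(x)\, \overline{\phi(x-k)}\, dx \\
&= \frac{1}{2\pi} \int_{\mathbb{R}} \Phi(\omega)\, \overline{\Phi(\omega)}\, e^{i\omega k}\, d\omega \\
&= \frac{1}{2\pi} \int_0^{2\pi} \sum_{l=-\infty}^{\infty} |\Phi(\omega+2\pi l)|^2\, e^{i\omega k}\, d\omega.
\end{aligned} \tag{7.30}$$

The last line in (7.30) is the Fourier coefficient $a_k$ in the Fourier series decomposition of

$$f(\omega) = \sum_{l=-\infty}^{\infty} |\Phi(\omega+2\pi l)|^2.$$

Due to the uniqueness of the Fourier representation, $f(\omega) = 1$. As a side result, we obtain that $\Phi(2\pi n) = 0$, $n \ne 0$, and $\sum_n \phi(x-n) = 1$. The last result follows from inspection of the coefficients $c_k$ in the Fourier decomposition of $\sum_n \phi(x-n)$, the series $\sum_k c_k\, e^{2\pi i k x}$. As this function is 1-periodic,

$$c_k = \int_0^1 \Big( \sum_n \phi(x-n) \Big) e^{-2\pi i k x}\, dx = \int_{-\infty}^{\infty} \phi(x)\, e^{-2\pi i k x}\, dx = \Phi(2\pi k) = \delta_{0,k}.$$

Remark 1   Utilizing the identity (7.29), any set of independent functions spanning $V_0$, $\{\phi(x-k),\ k\in\mathbb{Z}\}$, can be orthogonalized in the Fourier domain. The orthonormal basis is generated by integer shifts of the function

$$\mathcal{F}^{-1}\left[ \frac{\Phi(\omega)}{\sqrt{\sum_{l=-\infty}^{\infty} |\Phi(\omega+2\pi l)|^2}} \right]. \tag{7.31}$$

This normalization in the Fourier domain is used in constructing some wavelet bases.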

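The orthogonality condition (7.29) can be expressed in terms of $m_0$ as

$$|m_0(\omega)|^2 + |m_0(\omega+\pi)|^2 = 1. \tag{7.32}$$

Since $\sum_{l} |\Phi(2\omega + 2l\pi)|^2 = 1$, then by (7.23)

$$\sum_{l=-\infty}^{\infty} |m_0(\omega+l\pi)|^2\, |\Phi(\omega+l\pi)|^2 = 1. \tag{7.33}$$

Now split the sum in (7.33) into two sums, one with odd and the other with even indices, i.e.,

$$\sum_{k=-\infty}^{\infty} |m_0(\omega+2k\pi)|^2\, |\Phi(\omega+2k\pi)|^2 + \sum_{k=-\infty}^{\infty} |m_0(\omega+(2k+1)\pi)|^2\, |\Phi(\omega+(2k+1)\pi)|^2 = 1.$$

To simplify the above expression, we use (7.29) and the $2\pi$-periodicity of $m_0(\omega)$:

$$|m_0(\omega)|^2 \sum_{k=-\infty}^{\infty} |\Phi(\omega+2k\pi)|^2 + |m_0(\omega+\pi)|^2 \sum_{k=-\infty}^{\infty} |\Phi((\omega+\pi)+2k\pi)|^2 = 1.$$

Each of the two inner sums equals 1 by (7.29), which gives (7.32).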
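Condition (7.32) can be evaluated on a grid of frequencies for any finite filter. The following sketch assumes the Haar filter and the standard 4-tap Daubechies filter as examples of valid filters and a rescaled pair of equal taps as a counterexample; `power_complementarity_gap` is a hypothetical helper name.

```python
import cmath
import math

def m0(omega, h):
    """Transfer function m_0 of a finite filter h = (h_0, ..., h_{L-1})."""
    return sum(hk * cmath.exp(-1j * k * omega) for k, hk in enumerate(h)) / math.sqrt(2.0)

def power_complementarity_gap(h, n_grid=256):
    """Largest deviation of |m0(w)|^2 + |m0(w + pi)|^2 from 1 over a grid on [0, 2 pi)."""
    worst = 0.0
    for i in range(n_grid):
        w = 2.0 * math.pi * i / n_grid
        total = abs(m0(w, h)) ** 2 + abs(m0(w + math.pi, h)) ** 2
        worst = max(worst, abs(total - 1.0))
    return worst

s2, s3 = math.sqrt(2.0), math.sqrt(3.0)
haar = [1.0 / s2, 1.0 / s2]
four_tap = [(1 + s3) / (4 * s2), (3 + s3) / (4 * s2),
            (3 - s3) / (4 * s2), (1 - s3) / (4 * s2)]
rescaled = [0.6, 0.6]   # not normalized, so (7.32) should fail

for name, h in [("haar", haar), ("four_tap", four_tap), ("rescaled", rescaled)]:
    print(name, power_complementarity_gap(h))
```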

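Whenever a sequence of subspaces satisfies the MRA properties, there exists an orthonormal basis for $L_2(\mathbb{R})$ (though it is not unique),

$$\{\psi_{jk}(x) = 2^{j/2}\,\psi(2^j x - k),\ j,k\in\mathbb{Z}\}, \tag{7.34}$$

such that $\{\psi_{jk}(x),\ j\text{-fixed},\ k\in\mathbb{Z}\}$ is an orthonormal basis of the ''difference space'' $W_j = V_{j+1}\ominus V_j$. The function $\psi(x) = \psi_{00}(x)$ is called a wavelet function or, informally, the mother wavelet.

Next, we discuss the derivation of a wavelet function from the scaling function. Since $\psi(x)\in V_1$ (because of the containment $W_0\subset V_1$), it can be represented as

$$\psi(x) = \sum_{k\in\mathbb{Z}} g_k \sqrt{2}\, \phi(2x-k), \tag{7.35}$$

for some coefficients $g_k$, $k\in\mathbb{Z}$.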

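Define

$$m_1(\omega) = \frac{1}{\sqrt{2}} \sum_{k\in\mathbb{Z}} g_k\, e^{-ik\omega}. \tag{7.36}$$

By mimicking what was done with $m_0$, we obtain the Fourier counterpart of (7.35),

$$\Psi(\omega) = m_1\!\left(\frac{\omega}{2}\right) \Phi\!\left(\frac{\omega}{2}\right). \tag{7.37}$$

The spaces $W_0$ and $V_0$ are orthogonal by construction. Therefore,

$$\begin{aligned}
0 = \int \psi(x)\, \overline{\phi(x-k)}\, dx &= \frac{1}{2\pi} \int \Psi(\omega)\, \overline{\Phi(\omega)}\, e^{i\omega k}\, d\omega \\
&= \frac{1}{2\pi} \int_0^{2\pi} \sum_{l=-\infty}^{\infty} \Psi(\omega+2l\pi)\, \overline{\Phi(\omega+2l\pi)}\, e^{i\omega k}\, d\omega.
\end{aligned}$$

By repeating the Fourier series argument, as in (7.29), we conclude

$$\sum_{l=-\infty}^{\infty} \Psi(\omega+2l\pi)\, \overline{\Phi(\omega+2l\pi)} = 0.$$

By taking into account the definitions of $m_0$ and $m_1$, and by the derivation as in (7.32), we find

$$m_1(\omega)\, \overline{m_0(\omega)} + m_1(\omega+\pi)\, \overline{m_0(\omega+\pi)} = 0. \tag{7.38}$$

From (7.38), we conclude that there exists a function $\lambda(\omega)$ such that

$$\big( m_1(\omega),\ m_1(\omega+\pi) \big) = \lambda(\omega) \left( \overline{m_0(\omega+\pi)},\ -\overline{m_0(\omega)} \right). \tag{7.39}$$

By substituting $\xi = \omega+\pi$ and by using the $2\pi$-periodicity of $m_0$ and $m_1$, we conclude that

$$\lambda(\omega) = -\lambda(\omega+\pi), \quad \text{and} \quad \lambda(\omega) \ \text{is} \ 2\pi\text{-periodic}. \tag{7.40}$$

Any function $\lambda(\omega)$ of the form $e^{\pm i\omega} S(2\omega)$, where $S$ is an $L_2([0,2\pi])$, $2\pi$-periodic function, will satisfy (7.38); however, only the functions for which $|\lambda(\omega)| = 1$ will define an orthogonal basis $\psi_{jk}$ of $L_2(\mathbb{R})$.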

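To summarize, we choose $\lambda(\omega)$ such that

(i) $\lambda(\omega)$ is $2\pi$-periodic,

(ii) $\lambda(\omega) = -\lambda(\omega+\pi)$, and

(iii) $|\lambda(\omega)|^2 = 1$.

Standard choices for $\lambda(\omega)$ are $-e^{-i\omega}$, $e^{-i\omega}$, and $e^{i\omega}$; however, any other function satisfying (i)-(iii) will generate a valid $m_1$. We choose to define $m_1(\omega)$ as

$$m_1(\omega) = -e^{-i\omega}\, \overline{m_0(\omega+\pi)}, \tag{7.41}$$

since it leads to a convenient and standard connection between the filters $h$ and $g$.

The form of $m_1$ and (7.29) imply that $\{\psi(\bullet-k),\ k\in\mathbb{Z}\}$ is an orthonormal basis for $W_0$.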

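Since $|m_1(\omega)| = |m_0(\omega+\pi)|$, the orthogonality condition (7.32) can be rewritten as

$$|m_0(\omega)|^2 + |m_1(\omega)|^2 = 1. \tag{7.42}$$

By comparing the definition of $m_1$ in (7.36) with

$$\begin{aligned}
m_1(\omega) &= -e^{-i\omega}\, \frac{1}{\sqrt{2}} \sum_k h_k\, e^{i(\omega+\pi)k} \\
&= \frac{1}{\sqrt{2}} \sum_k (-1)^{1-k}\, h_k\, e^{-i\omega(1-k)} \\
&= \frac{1}{\sqrt{2}} \sum_n (-1)^{n}\, h_{1-n}\, e^{-i\omega n},
\end{aligned}$$

we relate $g_n$ and $h_n$ as

$$g_n = (-1)^n\, h_{1-n}. \tag{7.43}$$

In the signal processing literature, (7.43) is known as the quadrature mirror relation and the filters $h$ and $g$ as quadrature mirror filters.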
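The quadrature mirror relation can be turned into a few lines of code that build the high-pass filter from the low-pass one. The sketch below assumes the relation $g_n = (-1)^n h_{1-n}$ and the standard 4-tap Daubechies low-pass filter as input; filters are stored as dictionaries from integer index to tap so that negative indices are handled explicitly. Helper names are illustrative.

```python
import cmath
import math

def qmf(h):
    """High-pass taps g_n = (-1)^n h_{1-n}; h and g are dicts {index: tap}."""
    g = {}
    for k, hk in h.items():
        n = 1 - k
        sign = -1.0 if n % 2 else 1.0   # Python's % is non-negative for negative n
        g[n] = sign * hk
    return g

def transfer(omega, taps):
    """(1/sqrt 2) * sum_n taps[n] exp(-i n omega), i.e. m_0 for h and m_1 for g."""
    return sum(c * cmath.exp(-1j * n * omega) for n, c in taps.items()) / math.sqrt(2.0)

def cross_corr(h, g, l):
    """sum_k h_k g_{k-2l}; vanishes for every l when the two filters are orthogonal."""
    return sum(hk * g.get(k - 2 * l, 0.0) for k, hk in h.items())

s2, s3 = math.sqrt(2.0), math.sqrt(3.0)
h = {0: (1 + s3) / (4 * s2), 1: (3 + s3) / (4 * s2),
     2: (3 - s3) / (4 * s2), 3: (1 - s3) / (4 * s2)}
g = qmf(h)

print("g indices:", sorted(g))
print("sum of g:", sum(g.values()))
print("cross-correlations:", [cross_corr(h, g, l) for l in range(-2, 3)])
```

The taps of `g` live on the indices $-2,\ldots,1$ here, which is the kind of index shift that a ''shift constant'' in the definition of $g$ is meant to tidy up.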

Remark 2   Choosing $\lambda(\omega) = e^{i\omega}$ leads to the rarely used high-pass filter $g_n = (-1)^{n-1} h_{-1-n}$. It is convenient to define $g_n$ as $(-1)^n h_{1-n+M}$, where $M$ is a ''shift constant.'' Such re-indexing of $g$ affects only the shift-location of the wavelet function.

## 7.3.4 Haar Wavelets

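In addition to their simplicity and formidable applicability, Haar wavelets have tremendous educational value. Here we illustrate some of the relations discussed in Sect. 7.3.3 using the Haar wavelet. We start with the scaling function $\phi(x) = \mathbf{1}(0 \le x < 1)$ and pretend that everything else is unknown. By inspection of simple graphs of the two scaled functions $\phi(2x)$ and $\phi(2x-1)$ placed side by side, we conclude that the scaling equation (7.21) is

$$\phi(x) = \phi(2x) + \phi(2x-1) = \frac{1}{\sqrt{2}}\, \sqrt{2}\,\phi(2x) + \frac{1}{\sqrt{2}}\, \sqrt{2}\,\phi(2x-1),$$

which yields the wavelet filter coefficients:

$$h_0 = h_1 = \frac{1}{\sqrt{2}}.$$

The transfer functions are

$$m_0(\omega) = \frac{1}{\sqrt{2}} \left( \frac{1}{\sqrt{2}}\, e^{-i\omega \cdot 0} \right) + \frac{1}{\sqrt{2}} \left( \frac{1}{\sqrt{2}}\, e^{-i\omega \cdot 1} \right) = \frac{1 + e^{-i\omega}}{2}$$

and

$$m_1(\omega) = -e^{-i\omega}\, \overline{m_0(\omega+\pi)} = -e^{-i\omega} \left( \frac{1}{2} - \frac{1}{2}\, e^{i\omega} \right) = \frac{1 - e^{-i\omega}}{2}.$$

Notice that $m_0(\omega) = |m_0(\omega)|\, e^{i\varphi(\omega)} = \cos(\omega/2)\, e^{-i\omega/2}$ (after $\cos x = (e^{ix} + e^{-ix})/2$). Since $\varphi(\omega) = -\omega/2$, the Haar wavelet has linear phase, i.e., the scaling function is symmetric in the time domain. The orthogonality condition $|m_0(\omega)|^2 + |m_1(\omega)|^2 = 1$ is easily verified as well.

Relation (7.37) becomes

$$\Psi(\omega) = \frac{1 - e^{-i\omega/2}}{2}\, \Phi\!\left(\frac{\omega}{2}\right) = \frac{1}{2}\, \Phi\!\left(\frac{\omega}{2}\right) - \frac{1}{2}\, \Phi\!\left(\frac{\omega}{2}\right) e^{-i\omega/2},$$

and by applying the inverse Fourier transform we obtain

$$\psi(x) = \phi(2x) - \phi(2x-1)$$

in the time domain. Therefore we ''have found'' the Haar wavelet function $\psi$. From the expression for $m_1$ or by inspecting the representation of $\psi(x)$ by $\phi(2x)$ and $\phi(2x-1)$, we ''conclude'' that $g_0 = -g_1 = 1/\sqrt{2}$.

Although the Haar wavelets are well localized in the time domain, in the frequency domain they decay at the slow rate of $O(1/|\omega|)$ and are not effective in approximating smooth functions.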
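The Haar relations can be confirmed with a few direct computations. The sketch below assumes $\phi = \mathbf{1}_{[0,1)}$ and $\psi(x) = \phi(2x) - \phi(2x-1)$, checks the scaling equation on a grid, computes inner products of a handful of functions $\psi_{jk}(x) = 2^{j/2}\psi(2^j x - k)$ with a midpoint rule (exact here up to rounding, because the integrands are constant on each cell of the dyadic grid), and compares $m_0$ with $\cos(\omega/2)e^{-i\omega/2}$. The chosen index pairs are arbitrary.

```python
import cmath
import math

def phi(x):
    """Haar scaling function: indicator of [0, 1)."""
    return 1.0 if 0.0 <= x < 1.0 else 0.0

def psi(x):
    """Haar wavelet obtained from the scaling function."""
    return phi(2.0 * x) - phi(2.0 * x - 1.0)

def psi_jk(j, k):
    """Return the function x -> 2^{j/2} psi(2^j x - k)."""
    return lambda x: 2.0 ** (j / 2.0) * psi(2.0 ** j * x - k)

def inner(f, g, lo=-4.0, hi=4.0, n=8192):
    """Midpoint rule on a dyadic grid of cell width 2^-10."""
    step = (hi - lo) / n
    xs = (lo + (i + 0.5) * step for i in range(n))
    return step * sum(f(x) * g(x) for x in xs)

# Scaling equation phi(x) = phi(2x) + phi(2x - 1), checked pointwise.
grid = [-1.0 + i / 64.0 for i in range(192)]
scaling_ok = all(phi(x) == phi(2.0 * x) + phi(2.0 * x - 1.0) for x in grid)

# Gram matrix of a few wavelets; it should be the identity matrix.
indices = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 3), (-1, 0)]
basis = [psi_jk(j, k) for j, k in indices]
gram = [[inner(f, g) for g in basis] for f in basis]

def m0(omega):
    """Haar transfer function (1 + exp(-i omega)) / 2."""
    return (1.0 + cmath.exp(-1j * omega)) / 2.0

phase_gap = max(abs(m0(w) - math.cos(w / 2.0) * cmath.exp(-1j * w / 2.0))
                for w in [0.1 * i for i in range(-30, 31)])

print(scaling_ok, phase_gap)
for row in gram:
    print([round(v, 12) for v in row])
```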

## 7.3.5 Daubechies' Wavelets

The most important family of wavelets was discovered by Ingrid Daubechies and is fully described in Daubechies (1992)[8]. The wavelets in this family are compactly supported and have various degrees of smoothness.

The formal derivation of Daubechies' wavelets goes beyond the scope of this chapter, but the filter coefficients of some of the family members can be found from the following considerations.

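For example, to derive the filter taps of a wavelet with $N$ vanishing moments, or equivalently, $2N$ filter taps, we use the following equations.

The normalization property of the scaling function implies

$$h_0 + h_1 + \cdots + h_{2N-1} = \sqrt{2},$$

the requirement of $N$ vanishing moments for the wavelet function $\psi$ leads to

$$0^k h_0 - 1^k h_1 + 2^k h_2 - \cdots - (2N-1)^k h_{2N-1} = 0, \qquad k = 0, 1, \ldots, N-1$$

(with the convention $0^0 = 1$), and, finally, the orthogonality property can be expressed as

$$h_0 h_{2k} + h_1 h_{2k+1} + \cdots + h_{2N-1-2k}\, h_{2N-1} = \delta_k, \qquad k = 0, 1, \ldots, N-1.$$

We obtain $2N+1$ equations with $2N$ unknowns; however, the system is solvable since the equations are not independent.

Example 4   For $N=2$, we obtain the system:

$$\begin{aligned}
h_0 + h_1 + h_2 + h_3 &= \sqrt{2} \\
h_0^2 + h_1^2 + h_2^2 + h_3^2 &= 1 \\
h_0 - h_1 + h_2 - h_3 &= 0 \\
-h_1 + 2h_2 - 3h_3 &= 0 \\
h_0 h_2 + h_1 h_3 &= 0
\end{aligned}$$

which has a solution $h_0 = \frac{1+\sqrt{3}}{4\sqrt{2}}$, $h_1 = \frac{3+\sqrt{3}}{4\sqrt{2}}$, $h_2 = \frac{3-\sqrt{3}}{4\sqrt{2}}$, and $h_3 = \frac{1-\sqrt{3}}{4\sqrt{2}}$.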
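A candidate solution of such a system can be checked by evaluating the residual of every equation. The sketch below assumes the three groups of conditions in the usual form (sum of taps equal to $\sqrt{2}$, alternating moments $\sum_n (-1)^n n^k h_n = 0$ for $k < N$, and even-shift orthogonality) and the closed-form 4-tap values $(1\pm\sqrt{3})/(4\sqrt{2})$, $(3\pm\sqrt{3})/(4\sqrt{2})$; `daubechies_residuals` is a hypothetical helper name.

```python
import math

def daubechies_residuals(h):
    """Residuals of the 2N+1 equations for a candidate filter h of even length 2N."""
    two_n = len(h)
    n_moments = two_n // 2
    res = [sum(h) - math.sqrt(2.0)]                       # normalization
    for k in range(n_moments):                            # vanishing moments (0**0 == 1)
        res.append(sum((-1) ** n * n ** k * h[n] for n in range(two_n)))
    for k in range(n_moments):                            # orthogonality to even shifts
        dot = sum(h[n] * h[n + 2 * k] for n in range(two_n - 2 * k))
        res.append(dot - (1.0 if k == 0 else 0.0))
    return res

s2, s3 = math.sqrt(2.0), math.sqrt(3.0)
d4 = [(1 + s3) / (4 * s2), (3 + s3) / (4 * s2),
      (3 - s3) / (4 * s2), (1 - s3) / (4 * s2)]
d4_reversed = d4[::-1]                      # mirrored filter
haar_padded = [1 / s2, 1 / s2, 0.0, 0.0]    # Haar taps padded to length 4

for name, h in [("d4", d4), ("d4_reversed", d4_reversed), ("haar_padded", haar_padded)]:
    print(name, [round(r, 12) for r in daubechies_residuals(h)])
```

In this sketch the mirrored filter also satisfies all five equations, so the solution is not unique, whereas the padded Haar filter violates only the first-moment condition, which reflects its single vanishing moment.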

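For $N=4$, the system is

$$\begin{aligned}
h_0 + h_1 + h_2 + h_3 + h_4 + h_5 + h_6 + h_7 &= \sqrt{2} \\
h_0^2 + h_1^2 + h_2^2 + h_3^2 + h_4^2 + h_5^2 + h_6^2 + h_7^2 &= 1 \\
h_0 - h_1 + h_2 - h_3 + h_4 - h_5 + h_6 - h_7 &= 0 \\
h_0 h_2 + h_1 h_3 + h_2 h_4 + h_3 h_5 + h_4 h_6 + h_5 h_7 &= 0 \\
h_0 h_4 + h_1 h_5 + h_2 h_6 + h_3 h_7 &= 0 \\
h_0 h_6 + h_1 h_7 &= 0 \\
0h_0 - 1h_1 + 2h_2 - 3h_3 + 4h_4 - 5h_5 + 6h_6 - 7h_7 &= 0 \\
0h_0 - 1h_1 + 4h_2 - 9h_3 + 16h_4 - 25h_5 + 36h_6 - 49h_7 &= 0 \\
0h_0 - 1h_1 + 8h_2 - 27h_3 + 64h_4 - 125h_5 + 216h_6 - 343h_7 &= 0.
\end{aligned}$$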

Figure 7.9 depicts two scaling function and wavelet pairs from the Daubechies family. Figure 7.9a,b depict the pair with two vanishing moments, while Fig. 7.9c,d depict the pair with four vanishing moments.

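Figure 7.9: Scaling function and wavelet pairs from the Daubechies family: (a), (b) the pair with two vanishing moments; (c), (d) the pair with four vanishing moments.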
