Next: 7.3 Wavelets and Other Up: 7. Transforms in Statistics Previous: 7.1 Introduction
7.2 Fourier and Related Transforms

Functional series have a long history that can be traced back to the early nineteenth century. The French mathematician (and politician) Jean-Baptiste-Joseph Fourier decomposed a continuous function $ f(x)$, periodic on $ [-\pi, \pi]$, into the series of sines and cosines,

$\displaystyle \frac{a_0}{2} + \sum_{n=1}^{\infty} \left( a_n \cos nx + b_n \sin nx \right){},$    

where the coefficients $ a_n$ and $ b_n$ are defined as

$\displaystyle a_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos n x\, \mathrm{d} x{},\quad n=0,1,2,\ldots$    
$\displaystyle b_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin n x\, \mathrm{d} x{},\quad n=1,2,\ldots{}.$    

The sequences $ \{a_n, ~n=0,1,\ldots \}$ and $ \{b_n,~n=1,2, \ldots \}$ can be viewed as a transform of the original function $ f$. It is interesting that at the time of Fourier's discovery the very notion of a function was not precisely defined. Fourier methods have a long history in statistics, especially in the theory of nonparametric function and density estimation and in the theory of characteristic functions.
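A standard worked example: for $ f(x)=x$ on $ [-\pi,\pi]$, all $ a_n$ vanish because the integrand is odd, and integration by parts gives

$\displaystyle b_n = \frac{1}{\pi} \int_{-\pi}^{\pi} x \sin n x\, \mathrm{d} x = \frac{2 (-1)^{n+1}}{n}{},$    

so that $ x \sim 2 \left( \sin x - \frac{\sin 2x}{2} + \frac{\sin 3x}{3} - \cdots \right)$.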

There are three types of Fourier transforms: integral, series, and discrete. Next, we focus on discrete transforms and some modifications of the integral transform.

7.2.1 Discrete Fourier Transform

The discrete Fourier transform (DFT) of a sequence $ {\boldsymbol{f}}= \{f_n, ~n=0,1,\ldots, N-1 \}$ is defined as

$\displaystyle {\boldsymbol{F}} = \left\{ \sum_{n=0}^{N-1} f_n w_N^{nk}{}, \quad k=0, \ldots, N-1 \right\}{},$    

where $ w_N = \mathrm{e}^{-\mathrm{i} 2 \pi/N}$. The inverse transform is

$\displaystyle {\boldsymbol{f}} = \left\{ \frac{1}{N} \sum_{k=0}^{N-1} F_k w_N^{-nk}{}, \quad n=0, \ldots, N-1 \right\}{}.$    

The DFT can be interpreted as multiplication of the input vector by a matrix; therefore, the discrete Fourier transform is a linear operator. If $ Q=\{Q_{nk}= w_N^{nk}\}_{N \times N} = \{\mathrm{e}^{-\mathrm{i} 2 \pi n k/N}\}_{N \times N}$, then $ {\boldsymbol{F}} = Q \cdot {\boldsymbol{f}}$. The matrix $ Q$ is unitary (up to a scale factor), i.e., $ Q^{\ast} Q = N I$, where $ I$ is the identity matrix and $ Q^{\ast}$ is the conjugate transpose of $ Q$.
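As a minimal sanity check of the definitions above, here is a pure-Python sketch (no external libraries; direct $ O(N^2)$ sums rather than the FFT):

```python
import cmath

def dft(f):
    """F_k = sum_n f_n * w_N^{nk}, with w_N = exp(-i 2 pi / N)."""
    N = len(f)
    w = cmath.exp(-2j * cmath.pi / N)
    return [sum(f[n] * w ** (n * k) for n in range(N)) for k in range(N)]

def idft(F):
    """Inverse: f_n = (1/N) sum_k F_k * w_N^{-nk}."""
    N = len(F)
    w = cmath.exp(-2j * cmath.pi / N)
    return [sum(F[k] * w ** (-n * k) for k in range(N)) / N for n in range(N)]

f = [1.0, 2.0, 0.0, -1.0]
F = dft(f)
f_back = idft(F)          # should recover f up to rounding
```

The roundtrip `idft(dft(f))` recovers `f` to machine precision, which is the computational face of $ Q^{\ast} Q = N I$.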

There are many uses of the discrete Fourier transform in statistics. It turns cyclic convolutions into component-wise multiplications, and the fast version of the DFT has a low computational complexity of $ O(n \log(n))$, meaning that the number of operations needed to transform an input of size $ n$ is proportional to $ n \log(n)$. For the theory and various uses of the DFT across fields, the reader is directed to Brigham (1988) [5].
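The cyclic-convolution property can be illustrated with the same style of pure-Python sketch (again direct sums, chosen for clarity rather than speed):

```python
import cmath

def dft(f):
    N = len(f)
    w = cmath.exp(-2j * cmath.pi / N)
    return [sum(f[n] * w ** (n * k) for n in range(N)) for k in range(N)]

def idft(F):
    N = len(F)
    w = cmath.exp(-2j * cmath.pi / N)
    return [sum(F[k] * w ** (-n * k) for k in range(N)) / N for n in range(N)]

def cyclic_conv(f, g):
    """Direct cyclic convolution: (f * g)_n = sum_m f_m g_{(n-m) mod N}."""
    N = len(f)
    return [sum(f[m] * g[(n - m) % N] for m in range(N)) for n in range(N)]

f = [1.0, 2.0, 3.0, 4.0]
g = [0.5, -1.0, 0.0, 2.0]
direct = cyclic_conv(f, g)
# component-wise product in the Fourier domain, then invert
via_dft = [z.real for z in idft([a * b for a, b in zip(dft(f), dft(g))])]
```

Both routes give the same vector, confirming that the DFT diagonalizes cyclic convolution.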

We focus on estimation of the spectral density from observed data, an important statistical task in a variety of applied fields in which information about the frequency behavior of the phenomenon is of interest.

Let $ \{X_t, t \in {\mathbb{Z}} \}$ be a real, weakly stationary time series with zero mean and autocovariance function $ \gamma(h)=E X(t+h) X(t)$. An absolutely summable complex-valued function $ \gamma(\cdot)$ defined on the integers is the autocovariance function of $ X_t$ if and only if the function

$\displaystyle f(\omega) = \frac{1}{2 \pi} \sum_{h=-\infty}^{\infty} \gamma(h) \mathrm{e}^{-\mathrm{i} h \omega}$ (7.4)

is non-negative for all $ \omega \in [-\pi, \pi]$. The function $ f(\omega)$ is called the spectral density associated with the covariance function $ \gamma(h)$, and is in fact a version of the discrete Fourier transform of $ \gamma(h)$. The spectral density of a stationary process is a symmetric and non-negative function. Given the spectral density, the autocovariance function can be uniquely recovered via the inverse Fourier transform,

$\displaystyle \gamma(h) = \int_{-\pi}^{\pi} f(\omega) \mathrm{e}^{\mathrm{i} h \omega} \mathrm{d}\omega{}, \quad h = 0, \pm 1, \pm2, \ldots{}.$    
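As a quick illustration, for a white noise sequence with $ \gamma(0)=\sigma^2$ and $ \gamma(h)=0$ for $ h \neq 0$, the sum in (7.4) collapses to a single term and the spectral density is flat,

$\displaystyle f(\omega) = \frac{\sigma^2}{2 \pi}{}, \quad \omega \in [-\pi, \pi]{},$    

and the inversion formula recovers $ \gamma(h) = \frac{\sigma^2}{2\pi}\int_{-\pi}^{\pi} \mathrm{e}^{\mathrm{i} h \omega}\, \mathrm{d}\omega = \sigma^2 \boldsymbol{1}(h=0)$, as it should.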

A traditional statistic used as an estimator of the spectral density is the periodogram. The periodogram $ I(\omega_j)$, based on a sample $ X_0, \ldots, X_{T-1}$, is defined as

$\displaystyle I(\omega_j)= \frac{1}{2 \pi T} \left \vert\sum_{t=0}^{T-1} X_t \mathrm{e}^{-\mathrm{i} t\omega_j}\right \vert^2{},$ (7.5)

where $ \omega_j$ is the Fourier frequency $ \omega_j = 2\pi j/T$, $ j=[-T/2]+1, \ldots,-1, 0,1, \ldots, [T/2]$. By a discrete version of the sampling theorem it holds that $ I(\omega)$ is uniquely determined for all $ \omega \in [-\pi, \pi]$, given its values at Fourier frequencies.

Computationally, the periodogram is found using the fast Fourier transform. A simple MATLAB m-function calculating the periodogram is

function out = periodogram(ts)
% raw periodogram of a mean-centered series, as in (7.5)
out = abs(fftshift(fft(ts - mean(ts)))).^2/(2*pi*length(ts));
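An equivalent sketch in plain Python, computed directly from the definition (7.5) rather than via the FFT (so it is $ O(T^2)$, fine for modest $ T$):

```python
import cmath, math

def periodogram(x):
    """Raw periodogram I(w_j) = |sum_t x_t e^{-i t w_j}|^2 / (2 pi T)
    at the Fourier frequencies w_j = 2 pi j / T, j = 0, ..., T-1.
    The series is mean-centered first, as in the MATLAB snippet above."""
    T = len(x)
    m = sum(x) / T
    xc = [v - m for v in x]
    I = []
    for j in range(T):
        w = 2 * math.pi * j / T
        s = sum(xc[t] * cmath.exp(-1j * t * w) for t in range(T))
        I.append(abs(s) ** 2 / (2 * math.pi * T))
    return I

# a pure cosine at Fourier frequency w_3 concentrates all power there
T, k = 16, 3
x = [math.cos(2 * math.pi * k * t / T) for t in range(T)]
I = periodogram(x)
```

All of the mass sits at $ j=3$ and its mirror frequency $ j=T-3$, each with height $ T/(8\pi)$.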

An application of spectral and log-spectral estimation involves the famous Wolf sunspot data set. Although in this situation the statistician does not know the ''true'' signal, the theory developed by solar scientists helps to evaluate the performance of the algorithm.

The Sun's activity peaks every $ 11$ years, creating storms on the surface of our star that disrupt the Earth's magnetic field. These ''solar hurricanes'' can cause severe problems for electricity transmission systems. An example of the influence of such periodic activity on everyday life is the 1989 power blackout in the American northeast.

Efforts to monitor the amount and variation of the Sun's activity by counting spots on it have a long and rich history. Relatively complete visual estimates of daily activity date back to 1818, monthly averages can be extrapolated back to 1749, and estimates of annual values can be similarly determined back to 1700. Although Galileo made observations of sunspot numbers in the early 17th century, the modern era of sunspot counting began in the mid-1800s with the research of Bern Observatory director Rudolph Wolf, who introduced what he called the Universal Sunspot Number as an estimate of solar activity. The square roots of Wolf's yearly sunspot numbers are given in Fig. 7.4a; the data are from Tong (1996) [22], p. 471. Because of the wavelet data processing, we selected a sample whose size is a power of two, i.e., only $ 256$ observations, from 1733 till 1988. The square root transformation was applied to symmetrize and de-trend Wolf's counts. Figure 7.4b gives a raw periodogram, while Fig. 7.4c shows the estimator of the log-spectral density (Pensky and Vidakovic, 2003) [21].

Figure 7.4: (a) Square roots of Wolf's yearly sunspot numbers from 1733-1988 ($ 256$ observations); (b) Raw periodogram; (c) An estimator of the log-spectra. The frequency $ \omega ^{\ast } \approx {0.58}$ corresponds to Schwabe's period of $ {10.8}$ (years)

The estimator reveals a peak at frequency $ \omega ^{\ast } \approx {0.58}$, corresponding to the Schwabe cycle ranging from $ 9$ to $ 11.5$ (years), with an average of $ 2\pi/{0.58}\approx {10.8}$ years. The Schwabe cycle is the period between two subsequent maxima or minima of the solar activity, although solar physicists often think in terms of a $ 22$-year magnetic cycle, since the Sun's magnetic poles reverse direction every $ 11$ years.

7.2.2 Windowed Fourier Transform

Windowed Fourier transforms are important in providing simultaneous insight into the time and frequency behavior of functions. Standard Fourier transforms, describing the data in the ''Fourier domain,'' are precise in frequency but not in time: small changes in the signal (data) at one location cause changes in the Fourier domain globally. It is therefore of interest to have transform domains that are simultaneously precise in both time and frequency. Unfortunately, the precision of such an insight is limited by Heisenberg's uncertainty principle.

Suppose $ f(t)$ is a signal of finite energy. In mathematical terms, the integral of its modulus squared is finite or, in short, $ f$ belongs to the $ {{\mathbb{L}}}_2({{\mathbb{R}}})$ space.

The integral Fourier transform of the signal

$\displaystyle \mathcal{F}(\kern.5pt f)(\xi) = \hat{f}(\xi) = \int_{{{\mathbb{R}}}} f(t) \mathrm{e}^{-\mathrm{i} t\xi} \mathrm{d} t{},$ (7.6)

describes the allocation of energy content of a signal at different frequencies, but the time-related information is lost.

The windowed Fourier transform (also called the short time Fourier transform, STFT) was introduced by Gabor (1946) [12] to measure time-localized frequencies of sound. An atom in Gabor's decomposition is defined as

$\displaystyle g_{u, \xi} (t) = \mathrm{e}^{\mathrm{i} \xi t} g(t - u){},$    

where $ g$ is a real, symmetric, and properly normalized ''window'' function (normalized so that $ \vert\vert g \vert\vert=1$, which implies $ \vert\vert g_{u, \xi} \vert\vert = 1$).

If $ f \in {{\mathbb{L}}}_2({{\mathbb{R}}})$, then the windowed Fourier transform is defined as

$\displaystyle S f(u, \xi) = \langle f, g_{u, \xi} \rangle = \int_{{{\mathbb{R}}}} f(t) g(t-u) \mathrm{e}^{-\mathrm{i} \xi t} \mathrm{d} t{}.$ (7.7)

The chief use of windowed Fourier transforms is to analyze the time/frequency distribution of signal energy, via the spectrogram.

The spectrogram,

$\displaystyle P_S f(u, \xi) = \vert S f(u, \xi)\vert^2 = \left\vert \int_{-\infty}^{\infty} f(t) g(t-u) \mathrm{e}^{-\mathrm{i} \xi t} \mathrm{d} t \right\vert^2{},$    

expresses the energy distribution in the signal $ f$, with respect to time and frequency simultaneously.
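A discrete sketch of the spectrogram in plain Python: slide a window along the signal and take a DFT of each frame. The rectangular window and the frame parameters below are illustrative choices, not from the original text.

```python
import cmath, math

def stft(x, win, hop):
    """Discrete short-time Fourier transform: for each frame, multiply
    the signal by the shifted window and take a DFT (a sketch of (7.7))."""
    N = len(win)
    w = cmath.exp(-2j * cmath.pi / N)
    frames = []
    for start in range(0, len(x) - N + 1, hop):
        seg = [x[start + n] * win[n] for n in range(N)]
        frames.append([sum(seg[n] * w ** (n * k) for n in range(N))
                       for k in range(N)])
    return frames

# two tones occupying the two halves of the signal
N = 32
x = [math.sin(2 * math.pi * 4 * t / N) for t in range(128)] + \
    [math.sin(2 * math.pi * 12 * t / N) for t in range(128)]
win = [1.0] * N                      # rectangular window, for simplicity
S = stft(x, win, hop=N)
spec = [[abs(c) ** 2 for c in frame] for frame in S]   # spectrogram
```

Unlike the global Fourier transform, the spectrogram localizes each tone: early frames peak at bin 4, late frames at bin 12.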

The following are some basic properties of the STFT. Let $ f \in {{\mathbb{L}}}_2({{\mathbb{R}}})$. Then

$\displaystyle \mathrm{[Inverse STFT]}\qquad f(t) = \frac{1}{2 \pi} \int_{{{\mathbb{R}}}} \int_{{{\mathbb{R}}}} Sf(u, \xi) g(t-u) \mathrm{e}^{\mathrm{i} \xi t} \mathrm{d}\xi \mathrm{d} u{},$ (7.8)

and

$\displaystyle \mathrm{[Energy Conservation]}\qquad \int_{{{\mathbb{R}}}} \left\vert f(t)\right\vert^2 \mathrm{d} t = \frac{1}{2 \pi} \int_{{{\mathbb{R}}}} \int_{{{\mathbb{R}}}} \left\vert Sf(u, \xi)\right\vert^2 \mathrm{d}\xi \mathrm{d} u{}.$ (7.9)

The following is a characterizing property of STFT:

Let $ \Phi \in {{\mathbb{L}}}_2({{\mathbb{R}}}^2)$. There exists $ f \in {{\mathbb{L}}}_2({{\mathbb{R}}})$ such that $ \Phi(u, \xi) = Sf(u, \xi)$ if and only if

$\displaystyle \Phi\left(u_0, \xi_0\right) = \frac{1}{2 \pi} \int_{{{\mathbb{R}}}} \int_{{{\mathbb{R}}}} \Phi(u, \xi) {{\mathbb{K}}}\left(u_0, u, \xi_0, \xi\right) \mathrm{d} u \mathrm{d}\xi{},$ (7.10)

where

$\displaystyle {{\mathbb{K}}}\left(u_0, u, \xi_0, \xi\right) = \left\langle g_{u,\xi}, g_{u_0,\xi_0} \right\rangle = \int_{{{\mathbb{R}}}} g(t-u) g(t-u_0) \mathrm{e}^{-\mathrm{i} (\xi_0 - \xi) t} \mathrm{d} t{}.$ (7.11)

7.2.3 Hilbert Transform

We next describe the Hilbert transform and its use in defining instantaneous frequency, an important measure in statistical analysis of signals.

The Hilbert transform of a signal $ g(t)$ is defined by

$\displaystyle H_g(t) = \frac{1}{\pi} (VP) \int_{-\infty}^{\infty} \frac{g(\tau)}{t - \tau} \mathrm{d}\tau {}.$ (7.12)

Because of the possible singularity at $ \tau=t$, the integral is to be understood as a Cauchy principal value, denoted (VP). From (7.12) we see that $ H_g(t)$ is a convolution, $ 1/(\pi t) \ast g(t)$.

The spectrum of $ H_g(t)$ is related to that of $ g(t)$. From the convolution equation,

$\displaystyle \mathcal{F} (H_g(t)) = \mathcal{F} \left( \frac{1}{\pi t} \right) \cdot \mathcal{F} (g(t)){},$    

where $ \mathcal{F}$ denotes the Fourier transform. With a real signal $ g(t)$ one can associate a complex function whose real part equals $ g(t)$ and whose imaginary part equals $ H_g(t)$, namely $ h(t) = g(t) + \mathrm{i} H_g(t)$.

In statistical signal analysis this associated complex function $ h(t)$ is known as the analytic signal (or causal signal, since $ \hat{h}(\xi) = 0$ for $ \xi < 0$). Analytic signals are important since they possess a unique phase $ \phi(t)$, which leads to the definition of the instantaneous frequency.

If $ h(t)$ is represented as $ a(t) \cdot \exp\{\mathrm{i} \phi(t) \}$, then the quantity $ \mathrm{d} \phi/\mathrm{d} t$ is the instantaneous frequency of the signal $ g(t)$ at time $ t$. For more discussion and uses of the instantaneous frequency, the reader is directed to Flandrin (1992, 1999) [10,11].
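A discrete sketch of this idea in plain Python: build the analytic signal by zeroing the negative-frequency half of the DFT (a standard discrete construction of the Hilbert-transform pair; the tone parameters below are illustrative), then read off the instantaneous frequency from the phase increments.

```python
import cmath, math

def analytic_signal(x):
    """Discrete analytic signal via the DFT: keep the DC and Nyquist
    bins, double the positive-frequency bins, zero the negative ones
    (assumes an even length N)."""
    N = len(x)
    w = cmath.exp(-2j * cmath.pi / N)
    X = [sum(x[n] * w ** (n * k) for n in range(N)) for k in range(N)]
    for k in range(N):
        if 0 < k < N // 2:
            X[k] *= 2        # positive frequencies doubled
        elif k > N // 2:
            X[k] = 0         # negative frequencies removed
    return [sum(X[k] * w ** (-n * k) for k in range(N)) / N for n in range(N)]

# for a pure tone, the phase of the analytic signal advances linearly
N, k0 = 64, 5
x = [math.cos(2 * math.pi * k0 * t / N) for t in range(N)]
h = analytic_signal(x)
phase = [cmath.phase(z) for z in h]
# phase increments (mod 2*pi) estimate the instantaneous frequency
inc = [(phase[t + 1] - phase[t]) % (2 * math.pi) for t in range(N - 1)]
```

Each increment equals the tone's angular frequency $ 2\pi k_0/N$, i.e., the instantaneous frequency of a pure tone is constant, as expected.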

7.2.4 Wigner-Ville Transforms

The Wigner-Ville transform (or distribution) is a method for representing data (signals) in the time/frequency domain. In statistics, the Wigner-Ville transform provides a tool to define a localized spectral density for nonstationary processes.

Figure 7.5: (a) Sonar signal from a flying bat; (b) its Wigner-Ville transform

Ville (1948)[24] introduced the quadratic form that measures a local time-frequency energy:

$\displaystyle P_V f(u, \xi) = \int_{{{\mathbb{R}}}} f\left(u + \frac{\tau}{2} \right) f^{\ast}\left(u - \frac{\tau}{2}\right) \mathrm{e}^{-\mathrm{i} \tau \xi} \mathrm{d}\tau{},$    

where $ f^{\ast}$ is the complex conjugate of $ f$.

The Wigner-Ville transform is always real, since $ f(u + \frac{\tau}{2} ) f^{\ast}(u - \frac{\tau}{2})$ has Hermitian symmetry in $ \tau$.

Time and frequency play symmetric roles in $ P_V f(u, \xi)$; by applying the Parseval formula one gets

$\displaystyle P_V f(u, \xi) = \frac{1}{2 \pi} \int_{{{\mathbb{R}}}} \hat{f}\left(\xi + \frac{\gamma}{2}\right) \hat{f}^{\ast}\left(\xi - \frac{\gamma}{2}\right) \mathrm{e}^{-\mathrm{i} \gamma u} \mathrm{d}\gamma{}.$ (7.13)

For any $ f \in {{\mathbb{L}}}_2({{\mathbb{R}}})$,

$\displaystyle \int_{{{\mathbb{R}}}} P_V f(u, \xi) \mathrm{d} u = \vert\kern.5pt \hat{f}(\xi)\vert^2{},$ (7.14)

i.e., the time marginalization reproduces the power spectrum, and

$\displaystyle \int_{{{\mathbb{R}}}} P_V f(u, \xi) \mathrm{d} \xi = 2 \pi \vert\kern.5pt f(u)\vert^2{},$ (7.15)

i.e., the frequency marginalization is proportional to the squared modulus of the signal.

Equation (7.13) states that the one-dimensional Fourier transform of $ g_{\xi}(u)= P_V f(u, \xi)$ with respect to $ u$ is

$\displaystyle \hat{g}_{\xi}(\gamma) = \hat{f}\left(\xi + \frac{\gamma}{2}\right) \hat{f}^{\ast} \left(\xi - \frac{\gamma}{2}\right){}.$    

If $ \gamma=0$, then $ \hat{g}_{\xi}(0) = \int_{{{\mathbb{R}}}} g_\xi(u) \,\mathrm{d} u$, which proves (7.14). The proof of (7.15) is similar.

For example,

(i)
if $ f(t) = \boldsymbol{1}(-T \leq t \leq T)$, then

$\displaystyle P_V f(u, \xi) = \frac{2 \sin[ 2 (T-\vert u\vert) \xi ]}{\xi} \boldsymbol{1}(-T \leq u \leq T){}.$    

Plot $ P_V f(u, \xi)$.

(ii)
if $ f(t) = \exp\{ \mathrm{i} \lambda (t + \alpha t^2/2)\}$, then $ P_V f(u, \xi) = 2 \pi \delta(\xi - \lambda(1 + \alpha u))$.

(iii)
a Gaussian $ f(t) = (\sigma^2 \pi)^{-1/4} ~\exp (-t^2/(2 \sigma^2))$ is transformed into

$\displaystyle P_V f(u, \xi) = \frac{1}{\pi} \exp\left( -\frac{u^2}{\sigma^2} - \sigma^2 \xi^2 \right){}.$    

In this case, $ P_V f(u, \xi)=\vert\kern.5pt f(u)\vert^2 \cdot \vert\kern.5pt \hat{f}(\xi)\vert^2$. The Gaussian is the only distribution (up to time and frequency shifts) for which the Wigner-Ville transform remains positive. Some basic properties of Wigner-Ville transforms are listed in Table 7.1.


Table 7.1: Properties of the Wigner-Ville transform

Function                                  Wigner-Ville transform
$ f(t)$                                   $ P_V f(u, \xi)$
$ \mathrm{e}^{\mathrm{i}\phi} f(t) $      $ P_V f(u, \xi)$
$ f(t - u_0) $                            $ P_V f(u-u_0, \xi)$
$ \mathrm{e}^{\mathrm{i}\xi_0 t} f(t) $   $ P_V f(u, \xi - \xi_0)$
$ \mathrm{e}^{\mathrm{i} at^2 } f(t) $    $ P_V f(u, \xi - 2 au)$
$ \frac{1}{\sqrt{s}} f(t/s) $             $ P_V f(u/s, s \xi)$
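As a numerical sanity check of example (i) above, the defining integral of the Wigner-Ville transform can be evaluated with a simple trapezoid rule in plain Python and compared with the closed form (the grid size is an arbitrary choice):

```python
import cmath, math

def wv_indicator(u, xi, T, n=4000):
    """P_V f(u, xi) for f = 1(-T <= t <= T), by numerically integrating
    f(u + tau/2) f(u - tau/2) e^{-i tau xi} over tau.  The product of the
    two indicators equals 1 exactly when |tau| <= 2 (T - |u|), so we
    integrate e^{-i tau xi} over that interval with the trapezoid rule."""
    a = 2 * (T - abs(u))
    if a <= 0:
        return 0.0
    h = 2 * a / n
    total = 0.0 + 0.0j
    for j in range(n + 1):
        tau = -a + j * h
        wgt = 0.5 if j in (0, n) else 1.0
        total += wgt * cmath.exp(-1j * tau * xi)
    return (h * total).real

def wv_closed(u, xi, T):
    """Closed form 2 sin(2 (T - |u|) xi) / xi from example (i)."""
    return 2 * math.sin(2 * (T - abs(u)) * xi) / xi if abs(u) <= T else 0.0

pts = [(0.3, 1.7), (0.0, 0.5), (-0.6, 2.2)]
errs = [abs(wv_indicator(u, xi, T=1.0) - wv_closed(u, xi, T=1.0))
        for u, xi in pts]
```

The numerical integral matches the closed form at each test point, and the transform vanishes for $ \vert u\vert > T$, as the indicator support requires.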

Next we show that the expected value of the Wigner-Ville transform of a random process can serve as a definition of a generalized spectrum of a nonstationary process. Let $ X(t)$ be a real-valued zero-mean random process with covariance function

$\displaystyle E X(t) X(s) = R(t ,s) = R\left(u + \frac{\tau}{2}, u - \frac{\tau}{2}\right)= C(u, \tau){},$    

after the substitution $ \tau = t-s$ and $ u = (t+s)/2$.

Now, if the process $ X(t)$ is stationary, then $ C(u, \tau)$ is a function of $ \tau$ only and

$\displaystyle P_X(\xi) = \int_{-\infty}^{\infty} C(\tau) \mathrm{e}^{- \mathrm{i} \xi \tau} \mathrm{d} \tau$    

is its power spectrum.

For an arbitrary process, Flandrin (1999) [11] defined the ''power spectrum'' as

$\displaystyle P_X(u, \xi) = \int_{-\infty}^{\infty} C(u,\tau) \mathrm{e}^{- \mathrm{i} \xi \tau} \mathrm{d} \tau {}.$    

Thus, $ P_X(u, \xi)$ can be represented as $ \mathrm{E}\, P_V X(u,\xi)$, where

$\displaystyle P_V X(u,\xi) = \int_{-\infty}^{\infty} X\left(u + \frac{\tau}{2}\right) X\left(u - \frac{\tau}{2}\right)\mathrm{e}^{- \mathrm{i} \xi \tau} \mathrm{d} \tau{}.$    

For more information on Wigner-Ville transforms and their statistical use, the reader is directed to Baraniuk (1994), Carmona, Hwang and Torresani (1998), Flandrin (1999), and Mallat (1999) [3,6,11,18], among others.

