
# 7.1 Introduction

As an "appetizer" we give two simple examples of the use of transformations in statistics, the Fisher $z$ and Box-Cox transformations, as well as an application of the empirical Fourier-Stieltjes transform.

Example 1   Assume that we are looking for a variance-stabilizing transformation $Y = g(X)$ in the case where the variance $\sigma^2(\mu)$ of $X$ is a function of the mean $\mu = \mathbb{E}X$. The first-order Taylor expansion of $g(X)$ about the mean $\mu$ is

$$g(X) = g(\mu) + (X - \mu)\, g'(\mu) + O\big((X - \mu)^2\big) .$$

Ignoring quadratic and higher-order terms we see that

$$\mathbb{E}\, g(X) \approx g(\mu), \qquad \operatorname{Var} g(X) \approx \mathbb{E}(X - \mu)^2 \big(g'(\mu)\big)^2 = \sigma^2(\mu) \big(g'(\mu)\big)^2 .$$

If $\operatorname{Var} g(X)$ is to be $c^2$, we obtain

$$\sigma^2(\mu) \big(g'(\mu)\big)^2 = c^2 ,$$

resulting in

$$g(x) = \int \frac{c}{\sigma(x)}\, dx .$$

This is a theoretical basis for the so-called Fisher $z$-transformation.

Let $(X_{11}, X_{21}), \dots, (X_{1n}, X_{2n})$ be a sample from a bivariate normal distribution $\mathcal{N}_2(\mu_1, \mu_2, \sigma_1^2, \sigma_2^2, \rho)$, and let $\bar{X}_i = \frac{1}{n} \sum_{j=1}^{n} X_{ij}$, $i = 1, 2$.

The Pearson coefficient of linear correlation

$$r = \frac{\sum_{j=1}^{n} (X_{1j} - \bar{X}_1)(X_{2j} - \bar{X}_2)}{\left[ \sum_{j=1}^{n} (X_{1j} - \bar{X}_1)^2 \sum_{j=1}^{n} (X_{2j} - \bar{X}_2)^2 \right]^{1/2}}$$

has a complicated distribution involving special functions, e.g., Anderson (1984, p. 113)[1]. However, it is well known that the asymptotic distribution for $r$ is normal $\mathcal{N}(\rho, (1 - \rho^2)^2/n)$. Since the variance is a function of the mean, the variance-stabilizing transformation $g(r) = \int \frac{c \sqrt{n}}{1 - r^2}\, dr$ with $c = 1/\sqrt{n}$,

$$w = g(r) = \frac{1}{2} \log \frac{1 + r}{1 - r} = \operatorname{arctanh} r ,$$

is known as the Fisher $z$-transformation for the correlation coefficient. Assume that $r$ and $\rho$ are mapped to $w$ and $\zeta$ as

$$w = \frac{1}{2} \log \frac{1 + r}{1 - r} , \qquad \zeta = \frac{1}{2} \log \frac{1 + \rho}{1 - \rho} .$$

The distribution of $w$ is approximately normal $\mathcal{N}(\zeta, 1/(n - 3))$, and this approximation is quite accurate even when $n$ is as low as 20. The use of the Fisher $z$-transformation is illustrated below by finding confidence intervals for $\rho$ and testing hypotheses about $\rho$.

Figure 7.1: (a) histogram of the sample correlation coefficients; (b) histogram of the $z$-transformed coefficients with a superimposed normal approximation

To exemplify the above, we generated pairs of normally distributed random samples with a fixed theoretical correlation $\rho$. This was done by generating two i.i.d. standard normal samples $X_1$ and $X'$ of length $n$ and taking the transformation $X_2 = \rho X_1 + \sqrt{1 - \rho^2}\, X'$. The sample correlation coefficient $r$ is found. This was repeated $M$ times. The histogram of the $M$ sample correlation coefficients is shown in Fig. 7.1a. The histogram of the $z$-transformed $r$'s is shown in Fig. 7.1b with a superimposed normal approximation $\mathcal{N}(\operatorname{arctanh} \rho, 1/(n - 3))$.
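A simulation in the spirit of Fig. 7.1 can be sketched as follows; the values `n = 30`, `rho = 0.7`, and `M = 5000` are illustrative choices, not the ones used for the figure:

```python
import numpy as np

rng = np.random.default_rng(1)
n, rho, M = 30, 0.7, 5000                      # illustrative choices

ws = np.empty(M)
for m in range(M):
    x1 = rng.standard_normal(n)
    xp = rng.standard_normal(n)
    x2 = rho * x1 + np.sqrt(1 - rho**2) * xp   # second, correlated component
    r = np.corrcoef(x1, x2)[0, 1]              # sample correlation coefficient
    ws[m] = np.arctanh(r)                      # Fisher z-transform of r

# mean of w should be near arctanh(rho); variance near 1/(n - 3)
mean_w, var_w = ws.mean(), ws.var()
```

Plotting a histogram of `ws` against the density of $\mathcal{N}(\operatorname{arctanh}\rho, 1/(n-3))$ reproduces the qualitative picture of Fig. 7.1b.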

(i) For example, a $(1 - \alpha) \cdot 100\%$ confidence interval for $\rho$ is:

$$\left[ \tanh\!\left( w - \frac{z_{1-\alpha/2}}{\sqrt{n - 3}} \right) ,\; \tanh\!\left( w + \frac{z_{1-\alpha/2}}{\sqrt{n - 3}} \right) \right]$$

where $w = \operatorname{arctanh} r$, $\tanh x = (e^x - e^{-x})/(e^x + e^{-x})$, and $z_\alpha = \Phi^{-1}(\alpha)$, where $\Phi$ stands for the standard normal cumulative distribution function.

Given an observed $r$ and sample size $n$, one computes $w = \operatorname{arctanh} r$ and the normal-theory interval $w \pm z_{1-\alpha/2}/\sqrt{n - 3}$ for $\zeta$; in terms of $\rho$, the confidence interval is obtained by applying $\tanh$ to both endpoints.
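A direct implementation of this interval can be sketched as follows; the inputs `r = 0.7` and `n = 30` are illustrative, not values from the text:

```python
import math
from statistics import NormalDist

def fisher_ci(r, n, alpha=0.05):
    """(1 - alpha) confidence interval for rho via the Fisher z-transform."""
    w = math.atanh(r)                          # w = arctanh r
    z = NormalDist().inv_cdf(1 - alpha / 2)    # standard normal quantile
    half = z / math.sqrt(n - 3)
    return math.tanh(w - half), math.tanh(w + half)

lo, hi = fisher_ci(0.7, 30)                    # illustrative inputs
```

Note how the interval is symmetric on the $\zeta$ scale but, after applying $\tanh$, asymmetric on the $\rho$ scale.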

(ii) Assume that two samples of sizes $n_1$ and $n_2$, respectively, are obtained from two different bivariate normal populations. We are interested in testing $H_0: \rho_1 = \rho_2$ against the two-sided alternative. After observing $r_1$ and $r_2$ and transforming them to $w_1$ and $w_2$, we base the test on the approximately standard normal statistic $Z = (w_1 - w_2)/\sqrt{1/(n_1 - 3) + 1/(n_2 - 3)}$; the $p$-value of the test is $2\,\Phi(-|Z|)$.
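The two-sample comparison can be sketched in the same way; the correlations and sample sizes below are invented for illustration:

```python
import math
from statistics import NormalDist

def compare_correlations(r1, n1, r2, n2):
    """Two-sided p-value for H0: rho1 == rho2 via Fisher z-transforms."""
    w1, w2 = math.atanh(r1), math.atanh(r2)
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))   # std. error of w1 - w2
    z = (w1 - w2) / se
    return 2 * NormalDist().cdf(-abs(z))

p = compare_correlations(0.7, 40, 0.5, 50)        # made-up sample summaries
```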

Example 2   Box and Cox (1964)[4] introduced a family of transformations, indexed by a real parameter $\lambda$, applicable to positive data $y_1, \dots, y_n$,

$$g_\lambda(y_i) = \begin{cases} \dfrac{y_i^\lambda - 1}{\lambda}, & \lambda \neq 0 \\[4pt] \log y_i, & \lambda = 0 \end{cases} \tag{7.1}$$

This transformation is mostly applied to responses in linear models exhibiting non-normality and/or heteroscedasticity. For a properly selected $\lambda$, transformed data may look "more normal" and be amenable to standard modeling techniques. The parameter $\lambda$ is selected by maximizing the log-likelihood,

$$-\frac{n}{2} \log \left[ \frac{1}{n} \sum_{i=1}^{n} \left( g_\lambda(y_i) - \overline{g_\lambda(y)} \right)^2 \right] + (\lambda - 1) \sum_{i=1}^{n} \log y_i \tag{7.2}$$

where the $g_\lambda(y_i)$ are given in (7.1) and $\overline{g_\lambda(y)} = \frac{1}{n} \sum_{i=1}^{n} g_\lambda(y_i)$.

As an illustration, we apply the Box-Cox transformation to apparently skewed data on CEO salaries.

Forbes magazine published data on the best small firms in 1993. These were firms with annual sales of more than five and less than 350 million dollars. Firms were ranked by five-year average return on investment. One of the variables extracted is the annual salary of the chief executive officer for the top-ranked firms (since one datum is missing, the sample size is $n = 59$). Figure 7.2a shows the histogram of the raw data (salaries). The data show moderate skewness to the right. Figure 7.2b gives the values of the likelihood in (7.2) for different values of $\lambda$. Note the value $\hat{\lambda}$ at which (7.2) is maximized. Figure 7.2c gives the data transformed by the Box-Cox transformation with $\lambda = \hat{\lambda}$. The histogram of the transformed salaries is notably symmetrized.

Figure 7.2: (a) histogram of the raw salary data; (b) the log-likelihood (7.2) as a function of $\lambda$; (c) histogram of the Box-Cox-transformed salaries
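The grid search over $\lambda$ in (7.2) can be sketched as follows; since the Forbes salary data are not reproduced here, the sketch uses synthetic right-skewed (log-normal) data, for which the maximizing $\lambda$ should be near 0:

```python
import numpy as np

def boxcox(y, lam):
    """Box-Cox transform (7.1) of positive data y."""
    if abs(lam) < 1e-8:                  # treat tiny lambda as the log case
        return np.log(y)
    return (y**lam - 1) / lam

def boxcox_loglik(y, lam):
    """Profile log-likelihood (7.2) of lambda."""
    g = boxcox(y, lam)
    n = len(y)
    return (-n / 2 * np.log(np.mean((g - g.mean())**2))
            + (lam - 1) * np.sum(np.log(y)))

# synthetic right-skewed data (log-normal), NOT the Forbes salaries
rng = np.random.default_rng(0)
y = np.exp(rng.normal(0.0, 0.5, size=200))

grid = np.linspace(-2, 2, 401)
lam_hat = grid[np.argmax([boxcox_loglik(y, l) for l in grid])]
```

Plotting `boxcox_loglik(y, l)` over `grid` reproduces the shape of Fig. 7.2b; `scipy.stats.boxcox` performs the same maximization.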

Example 3   As an example of transforms utilized in statistics, we provide an application of the empirical Fourier-Stieltjes transform (the empirical characteristic function) to testing for independence.

The characteristic function of a probability distribution $F$ is defined as its Fourier-Stieltjes transform,

$$\varphi_X(t) = \mathbb{E}\, e^{itX} = \int e^{itx}\, dF(x) , \tag{7.3}$$

where $\mathbb{E}$ is the expectation and the random variable $X$ has distribution function $F$. It is well known that the correspondence between characteristic functions and distribution functions is 1-1, and that closeness in the domain of characteristic functions corresponds to closeness in the domain of distribution functions. In addition to uniqueness, characteristic functions are uniformly bounded, $|\varphi_X(t)| \le 1$. The same does not hold for moment generating functions, which are Laplace transforms of distribution functions.

For a sample $X_1, X_2, \dots, X_n$ one defines the empirical characteristic function $\varphi^*(t)$ as

$$\varphi^*(t) = \frac{1}{n} \sum_{j=1}^{n} e^{itX_j} .$$

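The empirical characteristic function is a one-line vectorized computation; as a sanity check, the sketch below compares it with the standard normal characteristic function $\varphi(t) = e^{-t^2/2}$ on a simulated sample:

```python
import numpy as np

def ecf(x, t):
    """Empirical characteristic function: (1/n) * sum_j exp(i t X_j)."""
    t = np.atleast_1d(np.asarray(t, dtype=float))
    return np.exp(1j * np.outer(t, x)).mean(axis=1)

# sanity check against the standard normal cf, phi(t) = exp(-t^2 / 2)
rng = np.random.default_rng(2)
x = rng.standard_normal(10_000)
t = np.array([0.0, 0.5, 1.0])
phi_star = ecf(x, t)          # complex values close to exp(-t**2 / 2)
```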
The result by Feuerverger and Mureika (1977)[9] establishes the large sample properties of the empirical characteristic function.

Theorem 1   For any fixed $T < \infty$,

$$\mathbb{P} \left( \lim_{n \to \infty}\, \sup_{|t| \le T} |\varphi^*(t) - \varphi_X(t)| = 0 \right) = 1$$

holds. Moreover, when $n \to \infty$, the stochastic process

$$Y_n(t) = \sqrt{n}\, \big( \varphi^*(t) - \varphi_X(t) \big), \quad |t| \le T,$$

converges in distribution to a complex-valued zero-mean Gaussian process $Y(t)$ satisfying $Y(t) = \overline{Y(-t)}$ and

$$\mathbb{E}\, Y(t)\, \overline{Y(s)} = \varphi_X(t - s) - \varphi_X(t)\, \overline{\varphi_X(s)} ,$$

where $\overline{Y(t)}$ denotes the complex conjugate of $Y(t)$.

Following Murata (2001)[20] we describe how the empirical characteristic function can be used in testing for independence of the two components of a bivariate distribution.

Given the bivariate sample $(X_j, Y_j)$, $j = 1, \dots, n$, we are interested in testing for independence of the components $X$ and $Y$. The test can be based on the following bivariate process,

$$Z_n(t, s) = \varphi^*_{X,Y}(t, s) - \varphi^*_X(t)\, \varphi^*_Y(s) ,$$

where $\varphi^*_{X,Y}(t, s) = \frac{1}{n} \sum_{j=1}^{n} e^{i(tX_j + sY_j)}$ is the empirical characteristic function of the joint sample and $\varphi^*_X$, $\varphi^*_Y$ are the empirical characteristic functions of the marginal samples. Under independence, $\varphi_{X,Y}(t, s) = \varphi_X(t)\, \varphi_Y(s)$, so $Z_n(t, s)$ should be close to zero.

Murata (2001)[20] shows that $\sqrt{n}\, Z_n(t, s)$ has a Gaussian weak limit and derives closed-form expressions for its asymptotic variance and covariance structure in terms of the marginal characteristic functions.
The statistic

$$T(t, s) = n\, \begin{pmatrix} \operatorname{Re} Z_n(t, s) \\ \operatorname{Im} Z_n(t, s) \end{pmatrix}^{\!\top} \Sigma^{-1} \begin{pmatrix} \operatorname{Re} Z_n(t, s) \\ \operatorname{Im} Z_n(t, s) \end{pmatrix}$$

has approximately a $\chi^2$ distribution with 2 degrees of freedom for any finite $t$ and $s$. The symbols $\operatorname{Re}$ and $\operatorname{Im}$ stand for the real and imaginary parts of a complex number. The matrix $\Sigma$ is the $2 \times 2$ asymptotic covariance matrix of $\big( \operatorname{Re} Z_n(t, s), \operatorname{Im} Z_n(t, s) \big)$, with entries obtained from the limiting covariance structure of $Z_n$. Any fixed pair $(t, s)$ gives a valid test, and in the numerical example the pair $(t, s)$ was fixed in advance for calculational convenience.

Figure 7.3: (a) histogram of the statistic $T$ under independence; (b) $p$-values under independence; (c) $p$-values when the components are dependent

We generated two independent components of size $n$ from a Beta distribution and computed the statistic $T$ and the corresponding $p$-value $M$ times. Figure 7.3a,b depicts histograms of the statistics $T$ and the $p$-values based on these $M$ simulations. Since the generated components $X$ and $Y$ are independent, the histogram for $T$ agrees with the asymptotic $\chi^2_2$ distribution and, of course, the $p$-values are uniform on $[0, 1]$. In Fig. 7.3c we show the $p$-values when the components $X$ and $Y$ are not independent. Using two independent Beta components $X$ and $Y'$, the second component $Y$ was constructed as a mixture of $X$ and $Y'$, inducing dependence between $X$ and $Y$. Notice that for the majority of simulation runs the independence hypothesis is rejected, i.e., the $p$-values cluster around 0.
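The $\chi^2$ calibration above requires the matrix $\Sigma$. A simpler sketch replaces Murata's calibration with a standard permutation approach to obtain the null distribution of $|Z_n(t, s)|$; the choices $t = s = 1$, Beta(2, 2) marginals, $n = 200$, and the mixture construction below are all illustrative, not the chapter's settings:

```python
import numpy as np

def zn(x, y, t, s):
    """Z_n(t, s): joint ecf minus the product of the marginal ecfs."""
    joint = np.exp(1j * (t * x + s * y)).mean()
    return joint - np.exp(1j * t * x).mean() * np.exp(1j * s * y).mean()

def perm_pvalue(x, y, t=1.0, s=1.0, B=499, rng=None):
    """Permutation p-value for independence based on |Z_n(t, s)|."""
    if rng is None:
        rng = np.random.default_rng()
    obs = abs(zn(x, y, t, s))
    hits = sum(abs(zn(x, rng.permutation(y), t, s)) >= obs for _ in range(B))
    return (1 + hits) / (B + 1)      # add-one rule keeps the p-value in (0, 1]

rng = np.random.default_rng(3)
x = rng.beta(2, 2, 200)
y_indep = rng.beta(2, 2, 200)                 # independent second component
y_dep = 0.5 * x + 0.5 * rng.beta(2, 2, 200)   # dependent second component

p_indep = perm_pvalue(x, y_indep, rng=rng)
p_dep = perm_pvalue(x, y_dep, rng=rng)
```

As in Fig. 7.3, the dependent construction yields a $p$-value near 0, while the independent one does not.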
