Consider the well-known definition of the $t$ statistic with $n - 1$ degrees of freedom:

$$t_{n-1} = \frac{\bar{x} - \mu}{s/\sqrt{n}}, \qquad (3.1)$$

where $\bar{x}$ and $s$ denote the mean and standard deviation of the sample $x_1, \ldots, x_n$, which is assumed to consist of normally, independently distributed variables:

$$x_i \sim \mathrm{NID}(\mu, \sigma^2) \qquad (i = 1, \ldots, n). \qquad (3.2)$$
Nearly 100 years ago, Gosset used a kind of Monte Carlo experiment (without using computers, since they were not yet invented) before he analytically derived the density function of this statistic (and published his results under the pseudonym of Student). So he sampled $n$ values (from an urn) satisfying (3.2), and computed the corresponding value of the statistic defined by (3.1). He repeated this experiment (say) $m$ times, so that he could compute the estimated density function (EDF) - also called the empirical cumulative distribution function (ECDF) - of the statistic. (Inspired by these empirical results, he did his famous analysis.)
Let us imitate his experiment in the following simulation experiment (this procedure is certainly not the most efficient computer program).
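A minimal Python sketch of such a simulation experiment follows (this is not the original program; the sample size n = 10, the number of macro-replicates m = 1000, and the seed are assumptions chosen for illustration):

```python
import numpy as np

def student_experiment(n=10, m=1000, seed=123):
    """Imitate Gosset's experiment: draw m samples of size n that satisfy
    the normality assumption (3.2), and compute the t statistic (3.1)
    for each sample."""
    rng = np.random.default_rng(seed)
    mu, sigma = 0.0, 1.0              # assumed parameter values
    t_values = np.empty(m)
    for i in range(m):                # m macro-replicates
        x = rng.normal(mu, sigma, size=n)
        s = x.std(ddof=1)             # sample standard deviation
        t_values[i] = (x.mean() - mu) / (s / np.sqrt(n))
    return np.sort(t_values)

t = student_experiment()

# The ECDF at point z is the fraction of t values not exceeding z:
def ecdf(z):
    return np.searchsorted(t, z, side="right") / t.size
```

By the symmetry of this statistic, the ECDF evaluated at zero should be close to 0.5.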
We may drop the classic assumption formulated in (3.2), and experiment with non-normal distributions. It is easy to sample from such distributions (see again Chap. II.2). However, we are now confronted with several so-called strategic choices (also see step 1 above): Which type of distribution should be selected (lognormal, exponential, etc.); which parameter values for that distribution type (mean and variance for the lognormal, etc.); which sample size $n$ (for asymptotic, 'large' $n$, the standard normal distribution is known to be a good approximation for our EDF).
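A sketch of such a strategic choice, with assumed parameter values: replace the normal input by a lognormal or an exponential distribution, and compute the statistic (3.1) with that distribution's known mean plugged in.

```python
import numpy as np

rng = np.random.default_rng(456)
n = 10  # sample size: itself a strategic choice

def t_statistic(x, mu):
    """The statistic (3.1) for a sample x with known true mean mu."""
    return (x.mean() - mu) / (x.std(ddof=1) / np.sqrt(len(x)))

# Distribution type and parameter values are assumptions for illustration.
x_lognormal = rng.lognormal(mean=0.0, sigma=1.0, size=n)  # true mean exp(0.5)
x_exponential = rng.exponential(scale=1.0, size=n)        # true mean 1.0

t_ln = t_statistic(x_lognormal, np.exp(0.5))
t_ex = t_statistic(x_exponential, 1.0)
```

For 'large' n, the EDF of such t values approaches the standard normal distribution even though the inputs are non-normal.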
Besides these choices, we must face some tactical issues: Which number of macro-replicates $m$ gives a good EDF? Can we use special variance reduction techniques (VRTs) - such as common random numbers and importance sampling - to reduce the variability of the EDF? We explain these techniques briefly, as follows.
Common random numbers (CRN) mean that the analysts use the same (pseudo)random numbers (PRN) - symbol $r$ - when estimating the effects of different strategic choices. For example, CRN are used when comparing the estimated quantiles for various distribution types. Obviously, CRN reduce the variance of estimated differences, provided CRN create positive correlation between the estimators being compared.
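A minimal sketch of CRN, under assumed choices (exponential versus uniform input, 2,000 PRN vectors): both alternatives are driven by the same uniform PRN, transformed through each distribution's inverse cumulative distribution function.

```python
import numpy as np

def estimated_quantile(u, transform, q=0.9):
    """Estimate the q-quantile of the sample mean when the uniform PRN in u
    (shape: macro-replicates by sample size) are transformed into draws
    from the chosen input distribution."""
    means = transform(u).mean(axis=1)
    return np.quantile(means, q)

rng = np.random.default_rng(789)
u = rng.uniform(size=(2000, 10))  # one common PRN stream

# CRN: feed the SAME u to both alternatives being compared.
q_expon = estimated_quantile(u, lambda u: -np.log(1.0 - u))  # exponential(1)
q_unif = estimated_quantile(u, lambda u: u)                  # uniform(0, 1)
diff_crn = q_expon - q_unif
```

Because both transformations are monotonically increasing in the PRN, the two quantile estimators are positively correlated, so the variance of their estimated difference is reduced.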
Antithetic variates (AV) mean that the analysts use the complements of the PRN ($1 - r$) in two 'companion'
macro-replicates. Obviously, AV reduces the variance of the estimator
averaged over these two replicates, provided AV creates negative
correlation between the two estimators resulting from the two
replicates.
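A minimal sketch of AV, for an assumed exponential(1) input sampled by the inverse transformation: each macro-replicate driven by PRN u gets a companion driven by the complements 1 - u, and the two resulting estimators are averaged.

```python
import numpy as np

def exp_mean_estimate(u):
    """Estimate the mean of an exponential(1) distribution by transforming
    uniform PRN u through the inverse cumulative distribution function."""
    return (-np.log1p(-u)).mean()   # log1p(-u) = log(1 - u)

rng = np.random.default_rng(2024)
m, n = 1000, 10
u = rng.uniform(size=(m, n))

est = np.array([exp_mean_estimate(row) for row in u])             # replicates
est_anti = np.array([exp_mean_estimate(1.0 - row) for row in u])  # companions

av = 0.5 * (est + est_anti)              # AV: average each companion pair
corr = np.corrcoef(est, est_anti)[0, 1]  # negative by construction
```

The monotone transformation turns the negative correlation between u and 1 - u into negative correlation between the two estimators, so the averaged estimator has smaller variance than a single replicate.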
Importance sampling (IS) is used when the analysts wish to estimate a rare event, such as the probability of the Student statistic exceeding an extreme quantile. IS increases that probability (for example, by sampling from a distribution with a fatter tail) - and later on, IS corrects for this distortion of the input distribution (through the likelihood ratio). IS is not as simple as CRN and AV - but without IS, too much computer time may be needed. See Glasserman et al. (2000).
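A minimal sketch of the IS idea, applied for simplicity to the tail of a standard normal variable rather than the Student statistic itself; this example shifts the mean of the sampling distribution (instead of fattening its tail) and corrects through the likelihood ratio.

```python
import numpy as np

def is_tail_probability(c=3.0, m=10_000, seed=7):
    """Estimate the rare-event probability P(Z > c), Z standard normal,
    by sampling from N(c, 1) so the event becomes common, and correcting
    each observation with the likelihood ratio phi(z) / phi(z - c)."""
    rng = np.random.default_rng(seed)
    z = rng.normal(loc=c, scale=1.0, size=m)
    lr = np.exp(-c * z + 0.5 * c**2)   # phi(z) / phi(z - c)
    return np.mean((z > c) * lr)

p_hat = is_tail_probability()  # the true value is about 0.00135
```

Crude Monte Carlo would need far more than these m samples to estimate such a small probability with comparable accuracy.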
There are many more VRTs. Both CRN and AV are intuitively attractive and easy to implement; the most popular VRT is CRN. The most useful VRT may be IS. In practice, the other VRTs often do not reduce the variance drastically, so many users prefer to spend more computer time instead of applying VRTs. (VRTs are a great topic for doctoral research!) For more details on VRTs, I refer to Kleijnen and Rubinstein (2001).
Finally, the question of which density function the sample data follow may not be a merely academic problem: Suppose a very limited set of historical data is given, and we must analyze these data while we know that they do not satisfy the classic assumption formulated in (3.2). Then bootstrapping may help, as follows (also remember the six steps above).