As mentioned earlier, a statistical test for RNGs is defined by
a random variable whose distribution under
can be well
approximated. When
takes the value
, we define the right and
left
-values of the test by
![]() |
Most tests are defined by partitioning the possible realizations of
into a finite number of subsets (where
the integer
can be random or deterministic), computing the
probability
of each subset
under
, and measuring the
discrepancy between these probabilities and empirical frequencies from
realizations simulated by the RNG.
A special case that immediately comes to mind is to take
(a constant) and cut the interval
into
equal segments for
some positive integer
, in order to partition the hypercube
into
subcubes of volume
. We then generate
points
, for
, and count the number
of points falling in
subcube
, for
. Any measure of distance (or
divergence) between the numbers
and their expectations
can
define a test statistic
. The tests thus defined are generally
called serial tests of uniformity ([31,60]).
They can be sparse (if
), or dense (if
), or somewhere in between. There are also
overlapping versions, where the points are defined by
for
(they have
overlapping coordinates).
Special instances for which the distribution of under
is
well-known are the chi-square, the (negative) empirical entropy,
and the number of collisions ([51,60,85]). For the
latter, the test statistic
is the number of times a point falls in
a subcube that already had a point in it. Its distribution under
is approximately Poisson with mean
, if
is large and
is not too large.
A variant is the birthday spacings test, defined as follows
([71,31,57]).
Let
be the numbers of the subcubes
that contain the points, sorted by increasing order. Define the
spacings
, for
, and
let
be the number of collisions between these spacings. Under
,
is approximately Poisson with mean
, if
is large and
not too large.
Consider now a MRG, for which has a regular lattice
structure. Because of this regularity the points of
will
tend to be more evenly distributed among the subcubes than random
points. For a well-chosen
and large enough
, we expect the
collision test to detect this: it is likely that there will be too few
collisions.
In fact, the same applies to any RNG whose set
is very evenly
distributed. When a birthday spacings test with a very large
is
applied to a MRG, the numbers of the subcubes that contain one point
of
tend to be too evenly spaced and the test detects this by
finding too many collisions.
These specific interactions between the test and the structure of the
RNG lead to systematic patterns in the -values of the tests. To
illustrate this, suppose that we take
slightly larger than the
cardinality of
(so
) and that due to the
excessive regularity, no collision is observed in the collision test.
The left
-value will then be
Poisson
. For this
-value to be
smaller than a given
, we need a sample size
proportional to the square root of the period length
. And
after that,
decreases exponentially fast in
.
Extensive experiments with LCGs, MRGs, and LFSR generators confirms
that this is actually what happens with these RNGs
([51,44,60]). For example, if we take
and define
as the minimal sample size
for which
, we find that
(plus
some noise) for LCGs that behave well in the spectral test as well as
for LFSR generators.
For the birthday spacings test, the rule for LCGs is
instead ([57]). So to be safe with respect to
these tests, the period length
must be so large that generating
more than
numbers is practically unfeasible. This
certainly disqualifies all LCGs with modulus smaller than
or
so.
Other types of tests for RNGs include tests based on the closest pairs
of points among points generated in the hypercube, tests based on
random walks on the real line or over the integers, tests based on the
linear complexity of a binary sequence, tests based on the simulation
of dices or poker hands, and many others
([23,31,58,72,86,92]).
When testing RNGs, there is no specific alternative hypothesis to
. Different tests are needed to detect different types of
departures from
. Test suites for RNGs include
a selection of tests, with predetermined parameters and sample sizes.
The best known are probably DIEHARD ([72]) and the NIST test
suite ([86]). The library TestU01 ([58])
implements a large selection of tests in the C language and provides
a variety of test suites, some designed for i.i.d.
output
sequences and others for strings of bits.