Next: 2.4 Generators Based on Up: 2. Random Number Generation Previous: 2.2 Uniform Random Number

Subsections

# 2.3 Linear Recurrences Modulo

## 2.3.1 The Multiple Recursive Generator

The most widely used RNGs are based on the linear recurrence

 (2.2)

where and are positive integers called the modulus and the order, and the coefficients are in  , interpreted as the set on which all operations are performed with reduction modulo . The state at step  is . When is a prime number, the finite ring is a finite field and it is possible to choose the coefficients so that the period length reaches (the largest possible value) ([31]). This maximal period length is achieved if and only if the characteristic polynomial of the recurrence, , is a primitive polynomial over , i.e., if and only if the smallest positive integer such that is . [31] explains how to verify this for a given . For , for to be a primitive polynomial, it is necessary that and at least another coefficient be nonzero. Finding primitive polynomials of this form is generally easy and they yield the simplified recurrence:

 (2.3)

multiple recursive generator (MRG) uses (2.2) with a large value of and defines the output as . For , this is the classical linear congruential generator (LCG). In practice, the output function is modified slightly to make sure that never takes the value 0 or  (e.g., one may define , or if and otherwise) but to simplify the theoretical analysis, we will follow the common convention of assuming that (in which case does take the value 0 occasionally).

## 2.3.2 The Lattice Structure

Let denote the th unit vector in dimensions, with a  in position and 0's elsewhere. Denote by the values of produced by the recurrence (2.2) when the initial state is . An arbitrary initial state can be written as the linear combination and the corresponding sequence is a linear combination of the sequences , with reduction of the coordinates modulo . Reciprocally, any such linear combination reduced modulo  is a sequence that can be obtained from some initial state . If we divide everything by we find that for the MRG, for each , where

is a -dimensional lattice in , with basis

For , contains all vectors whose coordinates are multiples of . For , it contains a fraction of those vectors.

This lattice structure implies that the points of are distributed according to a very regular pattern, in equidistant parallel hyperplanes. Graphical illustrations of this, usually for LCGs, can be found in a myriad of papers and books; e.g., [23,31,34], and [40]. Define the dual lattice to as

 for all

Each is a normal vector that defines a family of equidistant parallel hyperplanes, at distance apart, and these hyperplanes cover all the points of unless is an integer multiple of some other vector . Therefore, if is the euclidean length of a shortest non-zero vector in , then there is a family of hyperplanes at distance apart that cover all the points of . A small means thick slices of empty space between the hyperplanes and we want to avoid that. A large means a better (more uniform) coverage of the unit hypercube by the point set . Computing the value of is often called the spectral test ([31,21]).

The lattice property holds as well for the point sets formed by values at arbitrary lags defined by a fixed set of indices . One has for some lattice , and the largest distance between successive hyperplanes for a family of hyperplanes that cover all the points of is , where is the euclidean length of a shortest nonzero vector in , the dual lattice to .

The lattice and its dual can be constructed as explained in [10] and [49]. Finding the shortest nonzero vector in a lattice with basis can be formulated as an integer programming problem with a quadratic objective function:

 Minimize

subject to integers and not all zero. This problem can be solved by a branch-and-bound algorithm ([20,49,88]).

For any given dimension and points per unit of volume, there is an absolute upper bound on the best possible value of ([8,31,42]). Let denote such an upper bound. To define a figure of merit that takes into account several sets , in different numbers of dimensions, it is common practice to divide by an upper bound, in order to obtain a standardized value between 0 and , and then take the worst case over a given class of sets . This gives a figure of merit of the form

A value of too close to zero means that has a bad lattice structure for at least one of the selected sets . We want a value as close to  as possible. Computer searches for good MRGs with respect to this criterion have been reported by [47,46,41], for example. In most cases, was simply the sets of the form for , where was an arbitrary integer ranging from  to . [52] also consider the small dimensional sets  with indices not too far apart. They suggest taking for some positive integers . We could also take a weighted average instead of the minimum in the definition of  .

An important observation is that for , the -dimensional vector always belongs to , because for any vector , the first coordinates of must satisfy the recurrence (2.2), which implies that must be an integer. Therefore, one always has . Likewise, if contains 0 and all indices such that , then ([39]). This means that the sum of squares of the coefficients must be large if we want to have any chance that the lattice structure be good.

Contructing MRGs with only two nonzero coefficients and taking these coefficients small has been a very popular idea, because this makes the implementation easier and faster ([14,31]). However, MRGs thus obtained have a bad structure. As a worst-case illustration, consider the widely-available additive or subtractive lagged-Fibonacci generator, based on the recurrence (2.2) where the two coefficients and are both equal to . In this case, whenever contains , one has , so the distance between the hyperplanes is at least . In particular, for , all the points of (aside from the zero vector) are contained in only two planes! This type of structure can have a dramatic effect on certain simulation problems and is a good reason for staying away from these lagged-Fibonacci generators, regardless of their parameters.

A similar problem occurs for the ''fast MRG'' proposed by [14], based on the recurrence

with . If is small, the bound implies a bad lattice structure for . A more detailed analysis by [63] shows that this type of generator cannot have a good lattice structure even if the condition is removed. Another special case proposed by [15] has the form

 (2.4)

In this case, for , the vectors and both belong to the dual lattice  , where is the multiplicative inverse of  modulo . So neither nor should be small.

To get around this structural problem when contains certain sets of indices, [70] and [31] recommend to skip some of the output values in order to break up the bad vectors. For the lagged-Fibonacci generator, for example, one can output successive values produced by the recurrence, then skip the next values, output the next , skip the next , and so on. A large value of  (e.g., or more) may get rid of the bad structure, but slows down the generator. See [98] for further discussion.

## 2.3.3 MRG Implementation Techniques

The modulus is often taken as a large prime number close to the largest integer directly representable on the computer (e.g., equal or near for 32-bit computers). Since each can be as large as , one must be careful in computing the right side of (2.2) because the product is typically not representable as an ordinary integer. Various techniques for computing this product modulo are discussed and compared by [21,48,41], and [56]. Note that if , using is equivalent to using the negative coefficient , which is sometimes more convenient from the implementation viewpoint. In what follows, we assume that can be either positive or negative.

One approach is to perform the arithmetic modulo  in -bit (double precision) floating-point arithmetic ([41]). Under this representation, assuming that the usual IEEE floating-point standard is respected, all positive integers up to are represented exactly. Then, if each coefficient is selected to satisfy , the product will always be represented exactly and can be computed by the instructions

Similarly, if , will always be represented exactly.

A second technique, called approximate factoring ([48]), uses only the integer representation and works under the condition that or for some integer . One precomputes and . Then, can be computed by

 if   then   else

All quantities involved in these computations are integers between and , so no overflow can occur if can be represented as an ordinary integer (e.g., on a -bit computer).

The powers-of-two decomposition approach selects coefficients that can be written as a sum or difference of a small number of powers of  ([99,56,62]). For example, one may take and for some positive integers , , , and . To compute , decompose (where mod) and observe that

Suppose now that

 and (2.5)

Then, and , so can be computed by shifts, masks, additions, subtractions, and a single multiplication by . Intermediate results never exceed . Things simplify further if or or . For , is obtained simply by swapping the blocks of bits and ([99]). It has been pointed out by [56] that LCGs with parameters of the form and have bad statistical properties because the recurrence does not ''mix the bits'' well enough. However, good and fast MRGs can be obtained via the power-of-two decomposition method, as explained in [62].

Another interesting idea for improving efficiency is to take all nonzero coefficients equal to the same constant  ([72,15]). Then, computing the right side of (2.2) requires a single multiplication. [15] provide specific parameter sets and concrete implementations for MRGs of this type, for prime  near , and , , and .

One may be tempted to take equal to a power of two, say , because then the '' mod'' operation is much easier: it suffices to keep the least significant bits and mask-out all others. However, taking a power-of-two modulus is not recommended because it has several strong disadvantages in terms of the quality of the RNG ([35,40]). In particular, the least significant bits have very short periodicity and the period length of the recurrence (2.2) cannot exceed if , and if and . The maximal period length achievable with and , for example, is more than times smaller than the maximal period length achievable with and (a prime number).

## 2.3.4 Combined MRGs and LCGs

The conditions that make MRG implementations run faster (e.g., only two nonzero coefficients both close to zero) are generally in conflict with those required for having a good lattice structure and statistical robustness. Combined MRGs are one solution to this problem. Consider  distinct MRGs evolving in parallel, based on the recurrences

 (2.6)

where , for . Let be arbitrary integers,

 (2.7)

and

 (2.8)

This defines two RNGs, with output sequences and .

Suppose that the are pairwise relatively prime, that and have no common factor for each , and that each recurrence (2.6) is purely periodic with period length . Let and let be the least common multiple of . Under these conditions, the following results have been proved by [61] and [37]: (a) the sequence (2.8) is exactly equivalent to the output sequence of a MRG with (composite) modulus  and coefficients  that can be computed explicitly as explained in [37]; (b) the two sequences in (2.7) and (2.8) have period length ; and (c) if both sequences have the same initial state, then where can be bounded explicitly by a constant which is very small when the are close to each other.

Thus, these combined MRGs can be viewed as practical ways of implementing an MRG with a large and several large nonzero coefficients. The idea is to cleverly select the components so that: (1) each one is easy to implement efficiently (e.g., has only two small nonzero coefficients) and (2) the MRG that corresponds to the combination has a good lattice structure. If each is prime and if each component  has maximal period length , then each is even and cannot exceed . Tables of good parameters for combined MRGs of different sizes that reach this upper bound are given in [41] and [62], together with C implementations.

The recurrence (2.2) can be written in matrix form as

To jump ahead directly from to , for an arbitrary integer , it suffices to exploit the relationship

If this is to be done several times for the same , the matrix can be precomputed once for all. For a large , this can be done in matrix multiplications via a standard divide-and-conquer algorithm ([31]):

## 2.3.6 Linear Recurrences With Carry

These types of recurrences were introduced by [73] to obtain a large period even when is a power of two (in which case the implementation may be faster). They were studied and generalized by [90,9,11], and [24]. The basic idea is to add a carry to the linear recurrence (2.2). The general form of this RNG, called multiply-with-carry (MWC), can be written as

 (2.9) (2.10) (2.11)

where is a positive integer (e.g., a power of two), are arbitrary integers such that is relatively prime to , and is the multiplicative inverse of modulo . The state at step is . In practice, the sum in (2.11) is truncated to a few terms (it could be a single term if is large), but the theoretical analysis is much easier for the infinite sum.

Define and let be the inverse of in arithmetic modulo , assuming for now that . A major result proved in [90,11], and [24] is that if the initial states agree, the output sequence is exactly the same as that produced by the LCG with modulus  and multiplier . Therefore, the MWC can be seen as a clever way of implementing a LCG with very large modulus. It has been shown by [11] that the value of for this LCG satisfies for , which means that the lattice structure will be bad unless the sum of squares of coefficients is large.

In the original proposals of [73], called add-with-carry and subtract-with-borrow, one has for some and the other coefficients are zero, so for and the generator has essentially the same structural defect as the additive lagged-Fibonacci generator. In the version studied by [11], it was assumed that . Then, the period length cannot exceed if is a power of two. A concrete implementation was given in that paper. [24] pointed out that the maximal period length of can be achieved by allowing a more general . They provided specific parameters that give a maximal period for ranging from to and up to approximately .

Next: 2.4 Generators Based on Up: 2. Random Number Generation Previous: 2.2 Uniform Random Number