RNG Implementations: A Literature Survey (original) (raw)
Research Comments fromCiphers By Ritter
Terry Ritter
Some of the well-known fully-worked-out RNG proposals with an obvious or explicit computer implementation.
Contents
- 1979
- Bright and Enison describe their implementation of the Lewis and Payne GFSR, with sequence interruption for cryptographic use, and some of the tests used to evaluate the sequence.
- Schrage describes how to evaluate a linear multiplicative generator in pieces, so that the computation range is not exceeded.
- 1982
- Wichmann and Hill present one of the first portable designs (in FORTRAN). Based internally on three "byte-size" linear congruential generators, the three intermediate values never exceed 16-bit range.
- 1988
- L'Ecuyer gives some new "Efficient and Portable" (and tested) RNG designs. These are "combined multiplicative linear congruential" designs for 16- and 32-bit wordlengths, in easy Pascal.
- The Park and Miller design "Minimal Standard" RNG's. These are simple linear congruential designs for short wordlengths, also in easy Pascal.
- 1989
- Press and Teukolsky of_Numerical Recipes_ fame present a _quasi_random sequence for improved Monte Carlo integration.
- 1990
- Macomber and White provide a fully worked-out GFSR design (implemented in BASIC). Initialization will take a while.
- Fushimi design a GFSR RNG, implement it in portable C, and test it. Again, initialization is a problem.
- Ripley treats us to an insider's view of RNG's for simulation. There is also a new, faster (though still slow) approach to GFSR initialization.
- 1991
- Tezuka and L'Ecuyer design an RNG (a combined pair of ordinary shift-register RNG's), implement it in portable C, and test it.
- 1992
- Collins et. al. give a simple, mainly pedagogic, real-number RNG based on the logistic equation. This requires a simple transformation to produce the uniform distribution, and a different simple transformation gives a distribution which is very close to Gaussian.
- Press and Teukolsky present the new RNG's from Numerical Recipes new edition. These are basically thePark-Miller "Minimal Standard" andL'Ecuyer "Efficient and Portable" designs, with added shuffling.
- 1993
- Only five years later,Marsaglia responds to thePark-Miller "Minimal Standard" designs.
- 1994
- Marsaglia and Zaman critiquethe new Numerical Recipes generators (Park-Miller "Minimal Standard" and L'Ecuyer "Efficient and Portable"). They particularly discuss ran2, and present mzran() (in Fortran) and mzran13() (in C) as alternatives.
1979 -- Bright and Enison
Bright, H. and R. Enison. 1979. Quasi-Random Number Sequences from a Long-Period TLP Generator with Remarks on Application to Cryptography.Computing Surveys. 11(4): 357-370.
"The background of the Lehmer linear congruential algorithms that are in almost universal use for random sequence generation is reviewed; the reasoning is given for choosing the somewhat different Tausworthe linear recurrence (mod 2) type, using the bit selection algorithm of Lewis and Payne; and the initial statistical evaluation is reported on. A review is given of the intuitive process by which the Tausworthe approach, with a generating primitive trinomial of Mersenne prime exponent degree, was chosen."
"We have developed a generator which uses the basic Tausworthe algorithm. . . ." "Lewis-Payne bit selection appears to be superior to decimation: To generate _r_-bit numbers, each of the r bits is chosen from a different part of the same Tausworthe sequence with a constant gap between bits."
"Our TLP generator uses the trinomial _x_521 +_x_32 + 1 to generate 64-bit numbers that are 8-distributed and have good k_-distributivity for_k > 8. The period of the sequence is 2521 - 1. The LP gap is 52,100."
"Statistical Testing Criteria"
- "Equidistribution or Frequency Test. One counts the number of times a member of the sequence falls into a given interval. The number should be approximately the same for intervals of the same length if the sequence is uniform."
- "Serial Test. This is a higher dimensional version of the equidistribution test. One counts the number of times a_k_-tuple . . . of members of the sequence falls into a given _k_-dimensional cell. If this test is passed, the sequence is said to be _k_-distributed." "Other tests of _k_-distributivity are
- "Poker test: One considers groups of k members of the sequence and counts the number of distinct values represented in each group."
- "Maximum (minimum) of k: One plots the distribution of the function max (min) [of a _k_-tuple]."
- "Sum of k"
- "Gap Test. One plots the distribution of gaps in the sequence of various lengths. . . ."
- "Runs Test. One plots the distribution of maximal ascending (descending) runs of various lengths."
- "Coupon Collector's Test. One chooses a suitably small integer d and divides the universe into d intervals. Then each member of the sequence falls into one such interval. One plots the distribution of runs of various lengths required to have all d intervals represented."
- "Permutation Test. One studies the order relations between the members of the sequence in groups of k. Each of the k! possible orders should occur about equally often. If the universe is large, the probability of equality is small; otherwise, equal members may be disregarded."
- "Serial Correlation Test. One computes the correlation coefficient between consecutive members of the sequence. This gives the serial correlation for lag 1. Similarly, one may get the serial correlation for lag k by computing the correlation coefficient between xi and_xi+k._ This is to show that the members of the sequence are independent."
"Other tests have been proposed and used. Lewis and Payne [Lewi73b] ran a conditional bit test, which tested the independence of each bit in a string from the others, as well as a Fourier transform test."
1979 -- Schrage
Schrage, L. 1979. A More Portable Fortran Random Number Generator.ACM Transactions on Mathematical Software. 5(2): 132-138.
"A Fortran implementation of a random number generator is described which produces a sequence of random integers that is machine independent as long as the machine can represent all integers in the interval [-2**31 + 1, 2**31 - 1]."
The innovation here is not the RNG, but the method of evaluation, which guarantees that the intermediate values of the generator remain in range.
1982 -- Wichmann and Hill
Wichmann, B. and I. Hill. 1982. Algorithm AS 183. An Efficient and Portable Pseudo-random Number Generator.Applied Statistics. 31: 188-190.
"We claim [our algorithm] is reasonably short, reasonably fast, machine-independent, easily programmed in any language, and statistically sound. It has a cycle length exceeding [6.95 x 1012]."
"The algorithm produces numbers rectangularly distributed between 0 and 1, excluding endpoints." (Actually, zero can occur due to rounding error, but a simple test suffices to tame the situation.)
1988 -- L'Ecuyer
L'Ecuyer, P. 1988. Efficient and Portable Combined Random Number Generators.Communications of the ACM. 31(6): 742-749, 774.
"ABSTRACT: In this paper we present an efficient way to combine two or more Multiplicative Linear Congruential Generators (MLCGs) and propose several new generators. The individual MLCGs, making up the proposed combined generators, satisfy stringent theoretical criteria for the quality of the sequence they produce (based on the Spectral Test) and are easy to implement in a portable way."
Internally, we have two smaller MLCG's each of which can be implemented easily on 32-bit machines. The internal generators are combined by subtraction to a positive value in the range of one of the generators, then converted to reals by multiplication by 2-31.
"For 32-bit computers, we suggest l=2, m1 = 2147483563, a1 = 40014, m2 = 2147483399 and a2 = 40692." ". . . the combined generator has period . . . 2.30584 x 1018." "It works as long as the machine can represent all integers in the range [-231+85, 231-85]." "Notice that the function will never return 0.0 or 1.0, as long as REAL variables have at least 23-bit mantissa . . . ."
"We performed 21 different tests . . . ."
- Equidistribution test, using chi-square, d=64, n=1000, N=10000.
- Equidistribution test, using chi-square, d=256, n=10000, N=10000.
- Serial test with pairs (2-dimensional), d=64, n=100000, N=10000.
- Serial test with triplets (3-dimensional), d=16, n=100000, N=1000.
- Serial test with quadruplets (4-dimensional), d=8, n=100000, N=1000.
- Gap test, a=0, b=.05, t=15, n=10000, N=1000.
- Gap test, a=.95, b=1, t=15, n=10000, N=1000.
- Gap test, a=1/3, b=2/3, t=10, n=10000, N=1000.
- Poker test, k=4, d=4, n=10000, N=1000.
- Poker test, k=6, d=4, n=10000, N=1000.
- Poker test, k=6, d=8, n=10000, N=1000.
- Poker test, k=8, d=16, n=10000, N=1000.
- Coupon's collector test, d=5, t=25, n=10000, N=1000.
- Coupon's collector test, d=10, t=40, n=10000, N=1000.
- Permutation test, t=3, n=10000, N=1000.
- Permutation test, t=5, n=10000, N=1000.
- Runs-up test, n=100000, N=1000.
- Maximum-of-t test, t=8, d=128, n=10000, N=1000.
- Collision test, 6 dimensions, d=8, n=20000, N=100.
- Collision test, 10 dimensions, d=4, n=20000, N=100.
- Collision test, 20 dimensions, d=2, n=20000, N=100.
1988 -- Park and Miller
Park, S. and K. Miller. 1988. Random Number Generators: Good Ones are Hard to Find.Communications of the ACM. 31(10): 1192-1201.
The so-called "Minimal Standard" RNG turns out to be a fairly important reference point, if not an actual standard.
". . . the widespread adoption of good, portable, industry standard software for random number generation has proven to be an elusive goal. Many generators have been written, most of them have demonstrably non-random characteristics, and some are embarrassingly bad."
"To the non-specialist, the construction of a random number generator may appear to be the kind of thing that any good programmer can do easily. Over the years many programmers have unwittingly demonstrated that it is all too easy to 'hack' a procedure that will produce a strange looking, apparently unpredictable sequence of numbers. It is fundamentally more difficult, however, to write quality software which produces what is really desired -- a virtually infinite sequence of_statistically independent_ random numbers,uniformly distributed between 0 and 1. This is a key point: strange and unpredictable is not necessarily random."
The selected design is a multiplicative linear congruential generator (MLCG):
z i+1 = 16807_zi_ mod 231 - 1.
There are two INTEGER and two REAL implementations in easy Pascal. The first two require 46-bit INTEGERS or REALs. Schrange's method is used in the second pair of designs to keep intermediate values within the 32-bit range.
1989 -- Press and Teukolsky
Press, W. and S. Teukolsky. 1989. Quasi- (that is, Sub-) Random Numbers.Computers in Physics. Nov/Dec. 76-79.
"Sometimes we want to pick sample points in an_n_-dimensional space that are spread out as uniformly as possible. The most obvious case is where we want to integrate a function over _n_-dimensional space by a Monte Carlo method."
"In the familiar case of uncorrelated points, e.g., those generated by a good pseudorandom number generator, the fractional error of the estimated integral will decrease by the equally familiar 1 / SQRT(N) law."
"A simple counterexample is to choose sample points that lie on a Cartesian grid, and to sample each grid point exactly once . . . . The Monte Carlo method thus becomes a deterministic quadrature scheme . . . whose fractional error decreases at least as fast as 1 / N . . . ."
"Sequences of n_-tuples that fill n space more uniformly than uncorrelated random points are called_quasirandom sequences."
Actual FORTRAN code is given for such a generator.
1990 -- Macomber and White
Macomber, J. and C. White. 1990. An _n_-Dimensional Uniform Random Number Generator Suitable for IBM-Compatible Microcomputers.Interfaces. 20(3): 49-59.
"[There is] a basic validity problem with mixed congruential RNG's in multidimensional situations [Marsaglia 1968]. Increasing the dimensionality of a hypercube tends to increase the lattice effect. Therefore, what a user believes to be a reliable RNG can become entirely predictable or obviously nonrandom in_n_-dimensional applications."
"An RNG that overcomes many of the shortcomings of congruential generators is the generalized feedback shift register generator, of GFSR generator [Kennedy 1975; Lewis 1975]."
"The GFSR RNG requires a starting array of randomized binary digits in order to work. To generate these starting numbers, a setup program is required to initialize the generator."
"With the incorporation of Tezuka's test [1987b] within this environment, global _n_-space uniformity can be guaranteed."
1990 -- Fushimi
Fushimi, M. 1990. Random number generation with the recursion
Xt = Xt-3p xor_Xt-3q._ Journal of Computational and Applied Mathematics. 31: 195-118. North-Holland.
"Abstract: A generalized feedback shift register (GFSR) algorithm proposed by Lewis and Payne (1973) uses a primitive trinomial to generate a sequence of pseudorandom numbers. We propose a similar algorithm which uses a primitive polynomial with many non-zero terms, but generates a number as fast as the original GFSR algorithm. Our sequence is guaranteed to be equidistributed in higher dimensions and to have a good autocorrelation property."
"Marsaglia [17] pointed out about 20 years ago that the points in the unit _n_-cube generated by the Lehmer's linear congruential method lie in a relatively small number of parallel hyperplanes. All sequences generated with recursions of low degree over the finite field have the flaw of sparseness in_n_-space . . . ."
"We have developed a new method of generating pseudorandom numbers by linear recurrence modulo two. The method uses the recurrence formula
Xt = Xt-3p xor_Xt-3q_
instead of the formula
Xt = Xt-p xor_Xt-q_
proposed by Lewis and Payne, where 1 + zq + zp is a primitive polynomial of degree p over the Galois field GF(2) . . . ." "The crucial point in using this generator . . . is the initialization procedure."
"Using the primitive trinomial 1 + z32 + z521, we have performed extensive statistical tests on number sequences generated by our method."
1990 -- Ripley
Ripley, B. 1990. Thoughts on pseudorandom number generators.Journal of Computational and Applied Mathematics. 31: 153-163. North-Holland.
"Abstract: Much of the informal discussion at the Workshop concerned the merits of different pseudorandom number generators. Here we record some comments based on comparing generators across a wide range of machines."
"The idea properties of a good general-purpose pseudorandom number generator are easy to agree but impossible to achieve simultaneously. They include
- a very good approximation to uniform distribution,
- a very close to independent output in a moderate number of dimensions,
- repeatability from a simply specified starting point,
- speed,
- a very long period (at least 250)."
"It is by now known that well-designed congruential generators have a _k_-dimensional output which fills a lattice in [0,1)k. All of these examples have adequate lattice structure in up to 8 dimensions [13, Sect. 2.4] and so satisfy the first two requirements, at least weakly. It hardly seems worth searching for 'optimal' multipliers, since most choices are reasonably good. (I find the criteria of Fishman and Moore [3] unreasonably stringent. Generators whose criteria differ by a factor of two are for practical purposes indistinguishable.)"
"Some of these generators (notably Marsg) have a poor two-dimensional discrepancy." "I believe that discrepancy is not a good measure of a _pseudo_random number generator, and may not be a good measure for a _quasi_random sequence . . . ."
"The alternative family is that of _shift-register_generators . . . ." "The snag with GFSRs is their initialization." "FORTRAN code is given in the Appendix."
"The reader may already have guessed which generators I actually use. For all but the most demanding problems Marsgsuffices. It is simple to program in assembler . . . and easy to check against a double-precision implementation. If any doubt arises from the results of the simulation experiment I useGFSR521 as a cross-check, but have never had cause to doubt Marsg. However, the period ofMarsg is beginning to seem too small, and GFSRs have the edge for the future . . . ."
1990 -- Marsaglia and Zaman
Marsaglia, G. and A. Zaman. 1990. Toward a Universal Random Number Generator.Statistics and Probability Letters. 8: 35-39. North-Holland.
"Abstract: This article describes an approach toward a random number generator that passes all the stringent tests for randomness we have put to it, and that is able to reproduce exactly the same sequence of uniform random variables in a wide variety of computers, including TRS80, Apple, Macintosh, Commodore, Kaypro, IBM PC, AT, PC and AT clones, Sun, Vax, IBM 360/370, 3090, Amdahl, CDC Cyber and even 205 and ETA supercomputers."
This is a combined RNG. One part is a 97-element "lagged-Fibonacci" generator using multiplication. The other part is a linear congruential generator with modulus 169. This is a little more complex, and the write-up a little less clear, than other proposed "standard" designs. Includes Fortran code.
1991 -- Tezuka and L'Ecuyer
Tezuka, S. and P. L'Ecuyer. 1991. Efficient and Portable Combined Tausworthe Random Number Generators.ACM Transactions on Modeling and Computer Simulation. 1(2): 99-112.
"In this paper, we propose three combined Tausworthe random number generators with period length about 1018, whose _k_-distribution properties are good and which can be implemented in a portable way. These generators are found through an exhaustive search for the combination with the best lattice structure in GF{2,x}k, the_k_-dimensional vector space over the field of all Laurent series with coefficients in GF(s). We then apply a battery of statistical tests to these generators for the comprehensive investigation of their empirical statistical properties. No apparent defect was found. In the appendix, we give a sample program in C for the generators."
1992 -- Collins, et. al.
Collins, J., et. al. 1992. A random number generator based on the logit transform of the logistic variable.Computers in Physics. 6(6): 630-632.
[Note that the term "nonperiodic" in the following paragraph refers to the theoretical situation of mathematical infinite-precision real numbers. In the context of computer floating-point with a limited-precision, no iterative RNG can possibly be "nonperiodic." In fact, we now have the worrisome possibility of multiple cycles of various lengths.]
"A nonperiodic random number generator, which is based on the logistic equation, is presented. A simple transformation that operates on the logistic variable and leads to a sequence of random numbers with a near-Gaussian distribution, is described and discussed. The associated algorithm can be easily utilized in laboratory exercises, classroom demonstrations, and software written for stochastic modeling purposes."
1992 -- Press and Teukolsky
Press, W. and S. Teukolsky. 1992. Portable Random Number Generators.Computers in Physics. 6(5): 522-524.
"Publication of [our] new editions gives us a unique opportunity for confession: There are several sections in the old books that are embarrassingly bad." "Worst among them, we think, is the section on generators of uniform random numbers . . . [we gave] two generators (RAN1 and RAN2) that are at best only mediocre."
"Our new editions, and this column, provide three much better exemplars, named ran0, ran1, and ran2. (Routines in the new editions are named in lower case, in part to distinguish them from old routines.)"
ran0 is thePark-Miller "Minimal Standard" design. ran1 is the same generator with Bays-Durham shuffling. ran2 is theL'Ecuyer "Efficient and Portable"combined design, with added shuffling. A FORTRAN listing for each is given.
1993 -- Marsaglia
Marsaglia, G. 1993. Remarks on Choosing and Implementing Random Number Generators.Communications of the ACM. 36(7): 105-108.
(Comments on Park-Miller "Minimal Standard" designs.)
- "The congruential generator xn = 16807_x_ _n_-1 mod 231 - 1 is a good generator, but not a great generator."
- "The redeeming feature of the generator is the closeness of its modulus to a power of 2, providing nice, but tricky, machine language implementations. But for such implementations, 232 - 2 or 232 - 5 seem better choices." "Summary. Generators using modulus 231 - 1 are attractive only if they are implemented in machine language, but for such implementations the modulus 232 - 2 is preferable
1994 -- Marsaglia and Zaman
Marsaglia, G. and A. Zaman. 1994. Some portable very-long period random number generators.Computers in Physics. 8(1): 117-121.
"It is found that a proposed random number generator ran2, recently presented in the Numerical Recipes column . . . is a good one, but a number of generators are presented that are at least as good and are simpler, much faster, and with periods "billions and billions" of times longer. They are presented not necessarily to supplant ran2, but to put it in perspective. Any serious user of Monte Carlo methods should have a variety of random number generators from which to choose. In addition to two specific programs, one in Fortran and one in C, a framework is offered within which the readers can easily fashion their own generators with periods ranging from 1027 - 10101."
Terry Ritter, hiscurrent address, and his top page.
Last updated: 2002-01-17