[Python-Dev] a note in random.shuffle.doc ... (original) (raw)

Robert Kern robert.kern at gmail.com
Sat Jun 10 22:56:08 CEST 2006


Alex Martelli wrote:

...claims:

Note that for even rather small len(x), the total number of permutations of x is larger than the period of most random number generators; this implies that "most" permutations of a long sequence can never be generated. Now -- why would the behavior of "most" random number generators be relevant here? The module's docs claim, for its specific Mersenne Twister generator, a period of 2**19997-1, which is (e.g.) a comfortable 130128673800676351960752618754658780303412233749552410245124492452914636 028095467780746435724876612802011164778042889281426609505759158196749438 742986040468247017174321241233929215223326801091468184945617565998894057 859403269022650639413550466514556014961826309062543 times larger than the number of permutations of 2000 items, which doesn't really feel to me like a "rather small len(x)" in this context (what I'm most often shuffling is just a pack of cards -- len(x)==52 -- for example).

I wouldn't be too comfortable with that margin. The fun thing about factorials is that they add up really quickly. The crossover point is between 2080 and 2081.

In [28]: from scipy import *

In [29]: def f(x): ....: return special.gammaln(x-1) - 19937*log(2) ....:

In [30]: optimize.brentq(f, 2000, 3000) Out[30]: 2082.4031300820125

In [31]: import gmpy

In [32]: mtperiod = 2**19937 - 1

In [33]: for i in range(2075, 2085): ....: if gmpy.fac(i) > mtperiod: ....: print i ....: break ....: ....: 2081

I think that documenting this boundary might be worthwhile rather than making vague references to "small len(x)." A note along the lines of Josiah wrote about there being no guarantees despite period size would also be useful.

OTOH, isn't the exact PRNG algorithm considered an implementation detail? It certainly was when the module migrated from Wichmann-Hill to the Mersenne Twister. OTGH, I don't foresee the random module ever using an algorithm with worse characteristics than the Mersenne Twister.

-- Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco



More information about the Python-Dev mailing list