[Python-Dev] Re: [Python-checkins] python/dist/src/Lib random.py, 1.62, 1.63
Tim Peters tim.peters at gmail.com
Mon Aug 30 21:45:17 CEST 2004
[rhettinger at users.sourceforge.net]
Modified Files:
    random.py
Log Message:
Teach the random module about os.urandom().
...
* Provide an alternate generator based on it.
...
+ tofloat = 2.0 ** (-7*8)    # converts 7 byte integers to floats
...
+class HardwareRandom(Random):
...
+    def random(self):
...
+        return long(hexlify(urandom(7)), 16) * tofloat
Feeding in more bits than actually fit in a float leads to bias due to rounding. Here:
"""
import random
import math
import sys

def main(n, useHR):
    from math import ldexp
    if useHR:
        get = random.HardwareRandom().random
    else:
        get = random.random
    counts = [0, 0]
    for i in xrange(n):
        # Scale by 2**53 and keep the lowest bit, i.e. the bit worth 2**-53.
        x = long(ldexp(get(), 53)) & 1
        counts[x] += 1
    print counts
    expected = n / 2.0
    chisq = (counts[0] - expected)**2 / expected + \
            (counts[1] - expected)**2 / expected
    print "chi square statistic, 1 df, =", chisq

n, useNR = map(int, sys.argv[1:])
main(n, useNR)
"""
Running with the Mersenne random gives comfortable chi-squared values for the distribution of bit 2**-53:
C:\Code\python\PCbuild>python temp.py 100000 0
[50082, 49918]
chi square statistic, 1 df, = 0.26896

C:\Code\python\PCbuild>python temp.py 100000 0
[49913, 50087]
chi square statistic, 1 df, = 0.30276

C:\Code\python\PCbuild>python temp.py 100000 0
[50254, 49746]
chi square statistic, 1 df, = 2.58064
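To put a number on "comfortable", a statistic with 1 degree of freedom can be turned into an upper-tail probability. The sketch below is Python 3 rather than the Python 2 of the script above. It relies on the standard identity that a chi-square variable with 1 df is the square of a standard normal, and the helper name chi2_sf_1df is made up for illustration.

```python
import math

def chi2_sf_1df(chisq):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom.

    A 1-df chi-square variable is the square of a standard normal Z, so
    P(Z**2 > x) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(chisq / 2.0))

# The three statistics reported for the Mersenne Twister runs.
for stat in (0.26896, 0.30276, 2.58064):
    print(stat, chi2_sf_1df(stat))
```

A probability that is not tiny means the observed split is consistent with a fair bit. A statistic in the hundreds drives the same function to a value that is effectively zero.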
Running with HardwareRandom instead gives astronomically unlikely values:
C:\Code\python\PCbuild>python temp.py 100000 1
[52994, 47006]
chi square statistic, 1 df, = 358.56144

C:\Code\python\PCbuild>python temp.py 100000 1
[53097, 46903]
chi square statistic, 1 df, = 383.65636

C:\Code\python\PCbuild>python temp.py 100000 1
[53118, 46882]
chi square statistic, 1 df, = 388.87696
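The size of that skew can be reproduced by enumeration. This is a Python 3 sketch, assuming IEEE 754 doubles and round-half-even integer-to-float conversion. The names kept_bit and tally are illustrative only. Multiplying by 2**-56 is exact, so the rounding all happens when the 56-bit integer becomes a float, and bit 3 of the rounded integer is the bit worth 2**-53.

```python
def kept_bit(n):
    # Bit 3 of n after rounding to a double: the 2**-53 bit of n * 2**-56.
    return (int(float(n)) >> 3) & 1

def tally(top_bit):
    # Fix the leading bit, then try every pattern of the 4 lowest bits.
    counts = [0, 0]
    for low in range(16):
        counts[kept_bit((1 << top_bit) | low)] += 1
    return counts

results = {top: tally(top) for top in (55, 54, 53)}
print(results)  # {55: [9, 7], 54: [8, 8], 53: [8, 8]}
```

Only full 56-bit values (leading bit 55, half of all inputs) are skewed, because there the tested bit is the last one kept and ties round to even. That suggests an overall chance of a 0 bit of about 1/2 * 9/16 + 1/2 * 1/2 = 17/32, roughly 0.531, in line with the counts above.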
One way to repair that is to replace the computation with
return _ldexp(long(_hexlify(_urandom(7)), 16) >> 3, -BPF)
where _ldexp is math.ldexp (and BPF is already a module constant).
Of course that would also be biased on a box where C double had fewer than BPF (53) bits of precision (but the Twister implementation would show the same bias then).
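For readers who want to try the repair outside the module, here is a Python 3 sketch of the same idea. It assumes BPF is 53, uses int.from_bytes in place of long(hexlify(...), 16), and the function names are hypothetical rather than anything in random.py.

```python
import os
from math import ldexp

BPF = 53  # bits of precision assumed for a C double

def float_from_7_bytes(raw):
    """Map 7 bytes (56 bits) to a float in [0, 1) using only the top BPF bits."""
    if len(raw) != 7:
        raise ValueError("expected exactly 7 bytes")
    # Drop the 3 low bits first so the integer fits a double exactly,
    # which means no rounding takes place in the conversion.
    return ldexp(int.from_bytes(raw, "big") >> 3, -BPF)

def urandom_float():
    """Draw one float from the operating system's entropy source."""
    return float_from_7_bytes(os.urandom(7))

print(urandom_float())
```

Because the shift truncates instead of rounding, every 53-bit pattern is hit by exactly 8 of the 2**56 possible inputs, so each representable result is equally likely.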