msg99078 |
Author: Terrence Cole (terrence) |
Date: 2010-02-09 00:15 |
This code:

>>> random.seed(b'foo')
>>> random.getrandbits(8)
...repeated 7 more times...

Yields the sequence of values:

amd64: 227, 199, 34, 218, 83, 115, 236, 254
x86:   245, 198, 204, 66, 219, 4, 168, 93

Comments in the source seem to indicate random should produce the same results on all platforms. I first thought that the seed was not resetting the state correctly, however, if I do a 'random.setstate( (3,(0,)*625,None) )' before seeding the generator, the results do not change from what is given above. Also, calls to getrandbits after the setstate, but before another seed, correctly return 0. It seems from this that seed is resetting the state properly, but some of the internals are not 32bit/64bit consistent. |
|
|
msg99085 |
Author: Martin v. Löwis (loewis) *  |
Date: 2010-02-09 03:17 |
Would you like to work on a patch? |
|
|
msg99087 |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2010-02-09 06:52 |
Hmm, this may be difficult to fix without breaking somebody's expectation of repeating sequences they've already generated. The code is in random_getrandbits(): http://svn.python.org/view/python/trunk/Modules/_randommodule.c?revision=72344&view=markup |
|
|
msg99105 |
Author: Antoine Pitrou (pitrou) *  |
Date: 2010-02-09 11:16 |
It's not only getrandbits():

** x86 **
>>> random.seed(b'foo')
>>> random.random()
0.95824312997798622

** x86_64 **
>>> random.seed(b'foo')
>>> random.random()
0.88694660946819537
|
|
|
msg99106 |
Author: Antoine Pitrou (pitrou) *  |
Date: 2010-02-09 11:18 |
It works when seeding from a single integer, though:

>>> import random
>>> random.seed(123)
>>> random.random()
0.052363598850944326

So I guess it's the seeding-from-an-array which is buggy. |
|
|
msg99107 |
Author: Antoine Pitrou (pitrou) *  |
Date: 2010-02-09 11:25 |
Ok, it's simple really. When seeding from something other than an integer, seed() takes the hash of the object instead of considering all its bytes. That might be considered a weakness, since you lose entropy; also, Python's hash() is not supposed to be cryptographically strong. The hash is different in 32-bit and 64-bit mode (although the lower 32 bits are the same, at least for a bytes object), and since all the bits are taken into account, the initial state is different. So the easy workaround for the OP is to seed with an integer rather than a bytes object. |
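A minimal sketch of the effect described above. The value `h64` is a made-up stand-in for a 64-bit hash (it is not the real hash of b'foo'), and `first_values` is a hypothetical helper. It only illustrates that two integer seeds agreeing in their lower 32 bits still produce different sequences, because every bit of an integer seed is used.

```python
import random

def first_values(seed, count=8):
    """Return the first `count` 32-bit outputs of a generator seeded with `seed`."""
    rng = random.Random(seed)
    return [rng.getrandbits(32) for _ in range(count)]

# Hypothetical stand-in for a 64-bit hash value (NOT the real hash of b'foo').
h64 = 0x123456789ABCDEF0
# The value a 32-bit build might see if only the low 32 bits agree.
h32 = h64 & 0xFFFFFFFF

same_low_bits = (h64 & 0xFFFFFFFF) == h32
sequences_differ = first_values(h64) != first_values(h32)
print(same_low_bits, sequences_differ)
```

Seeding with an explicit integer sidesteps this, since the integer does not depend on the build's hash width.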
|
|
msg99109 |
Author: Michael Foord (michael.foord) *  |
Date: 2010-02-09 11:31 |
If we aren't going to fix it, should we document the limitation? |
|
|
msg99110 |
Author: Antoine Pitrou (pitrou) *  |
Date: 2010-02-09 11:43 |
Well, ideally we should drop the automatic hash() and only accept:
1) ints/longs
2) buffer-like objects
(and tell people to hash() explicitly if they want to)

If that's too disruptive, we should document it. And, for 3.x, provide the following recipe to seed from a bytes object without losing entropy, while keeping the same results under 32-bit and 64-bit builds:

>>> import random
>>> random.seed(int.from_bytes(b'foo', 'little'))
>>> random.random()
0.08384169414918807
|
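As a rough illustration of why this recipe does not lose entropy: the integer can be converted back to the original bytes, which is not possible with a hash. The variable names are illustrative only, and the sketch uses a private `random.Random` instance rather than the module-level generator.

```python
import random

data = b'foo'
n = int.from_bytes(data, 'little')

# Every byte of the input survives the conversion and can be recovered.
round_trip = n.to_bytes(len(data), 'little')

# Caveat: in little-endian order, trailing zero bytes do not change the integer,
# so b'foo' and b'foo\x00' map to the same seed.
padded = int.from_bytes(data + b'\x00', 'little')

rng = random.Random(n)  # private generator; module-level state is untouched
value = rng.random()
print(n, round_trip, padded == n)
```

If inputs that differ only in trailing zero bytes must seed differently, one option is to include the length in the seed; that is a design choice outside the recipe above.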
|
|
msg99111 |
Author: Mark Dickinson (mark.dickinson) *  |
Date: 2010-02-09 12:02 |
[Antoine]
> >>> random.seed(int.from_bytes(b'foo', 'little'))

+1 for either documenting this useful trick, or modifying init_by_array to do this automatically for buffer-like objects. Disallowing other types of input for the seed sounds dangerous, though. |
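One way to picture the "do this automatically for buffer-like objects" option is a thin wrapper at the Python level. `seed_portably` is a hypothetical helper written for this sketch; it is not how random.seed() or init_by_array is implemented, and other seed types are simply passed through unchanged.

```python
import random

def seed_portably(rng, a):
    """Seed `rng`, converting buffer-like objects to an int first.

    Hypothetical helper: NOT the actual implementation of random.seed().
    """
    if isinstance(a, (bytes, bytearray, memoryview)):
        a = int.from_bytes(bytes(a), 'little')
    rng.seed(a)  # any other type is handed to seed() as-is

r1, r2 = random.Random(), random.Random()
seed_portably(r1, b'foo')
seed_portably(r2, bytearray(b'foo'))
same = r1.random() == r2.random()
print(same)
```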
|
|
msg99123 |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2010-02-09 16:40 |
I will update the documentation. |
|
|
msg99157 |
Author: Terrence Cole (terrence) |
Date: 2010-02-10 07:53 |
Thank you for all the help! I'm glad to see that the use of hash() on buffer-compatible objects is an actual gotcha and not just me being obtuse.

Also, for those googling who, like me, won't be able to use 3.2's from_bytes until 3.2 is released in December, here is code to convert a bytes object to an int:

>>> n = 0
>>> off = 0
>>> data = b'\xfc\x00'
>>> for c in data[::-1]:
...     n += c << off
...     off += 8
...
>>> print(n)
64512

Or, if you prefer the functional style:

>>> sum([c<<off for c,off in zip(data[::-1],range(0,len(data)*8+1,8))])
64512
|
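The same conversion can be wrapped in a small function. `bytes_to_int` is a name chosen for this sketch. Note that the loop above treats the first byte as the most significant (big-endian), whereas the int.from_bytes recipe earlier in the thread uses 'little', so the two give different integers for the same bytes; either works as a seed as long as it is used consistently.

```python
def bytes_to_int(data):
    """Interpret `data` as an unsigned big-endian integer (first byte most significant)."""
    n = 0
    for c in data:  # iterating over bytes yields ints in Python 3
        n = (n << 8) | c
    return n

print(bytes_to_int(b'\xfc\x00'))  # 64512, the same value as the session above
```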
|
|
msg115740 |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2010-09-07 04:51 |
Fixed in r84574 and r84576. The seed function no longer uses hash() for str, bytes, or bytearray arguments. |
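A quick sanity check, assuming Python 3.2 or later, where bytes seeds no longer go through hash(). It only shows that seeding from the same bytes object is repeatable within one interpreter; confirming identical output across 32-bit and 64-bit builds would require running it on both.

```python
import random

a = random.Random(b'foo')
b = random.Random(b'foo')
c = random.Random(b'bar')

seq_a = [a.getrandbits(32) for _ in range(8)]
seq_b = [b.getrandbits(32) for _ in range(8)]
seq_c = [c.getrandbits(32) for _ in range(8)]

# Same bytes seed gives the same sequence; a different seed gives a different one.
print(seq_a == seq_b, seq_a != seq_c)
```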
|
|