Issue 7889: random produces different output on different architectures

Created on 2010-02-09 00:15 by terrence, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (12)
msg99078 - (view)	Author: Terrence Cole (terrence)	Date: 2010-02-09 00:15
This code:

>>> random.seed(b'foo')
>>> random.getrandbits(8)
...repeated 7 more times...

Yields the sequence of values:

amd64: 227, 199, 34, 218, 83, 115, 236, 254
x86: 245, 198, 204, 66, 219, 4, 168, 93

Comments in the source seem to indicate random should produce the same results on all platforms. I first thought that the seed was not resetting the state correctly; however, if I do a 'random.setstate( (3,(0,)*625,None) )' before seeding the generator, the results do not change from what is given above. Also, calls to getrandbits after the setstate, but before another seed, correctly return 0. It seems from this that seed is resetting the state properly, but some of the internals are not 32bit/64bit consistent.
msg99085 - (view)	Author: Martin v. Löwis (loewis) *	Date: 2010-02-09 03:17
Would you like to work on a patch?
msg99087 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2010-02-09 06:52
Hmm, this may be difficult to fix without breaking somebody's expectation of repeating sequences they've already generated. The code is in random_getrandbits(): http://svn.python.org/view/python/trunk/Modules/_randommodule.c?revision=72344&view=markup
msg99105 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2010-02-09 11:16
It's not only getrandbits():

x86
>>> random.seed(b'foo')
>>> random.random()
0.95824312997798622

x86_64
>>> random.seed(b'foo')
>>> random.random()
0.88694660946819537
msg99106 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2010-02-09 11:18
It works when seeding from a single integer, though:

>>> import random
>>> random.seed(123)
>>> random.random()
0.052363598850944326

So I guess it's the seeding-from-an-array which is buggy.
msg99107 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2010-02-09 11:25
Ok, it's simple really. When seeding from something other than an integer, seed() takes the hash of the object (instead of considering all its bytes, which might be considered a weakness since you lose entropy -- also, Python hash() is not supposed to be cryptographically strong). The hash is different in 32-bit and 64-bit mode (although the lower 32 bits are the same, at least for a bytes object), and since all the bits are taken into account the initial state is different. So the easy workaround for the OP is to seed with an integer rather than a bytes object.
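A minimal sketch of the effect described above. The two seed values are made up for illustration (they are not real hash() results); they only mimic the situation where the 32-bit and 64-bit hashes agree in their lower 32 bits. The helper name first_outputs is hypothetical.

```python
import random

# Hypothetical hash values, chosen only for illustration: both share the
# same lower 32 bits, but the "64-bit" one also has upper bits set.
LOW_BITS = 0x1234ABCD
hash_32bit = LOW_BITS
hash_64bit = (0x0BADF00D << 32) | LOW_BITS


def first_outputs(seed_value, count=8):
    """Return the first `count` 32-bit outputs of a generator seeded with seed_value."""
    rng = random.Random(seed_value)
    return [rng.getrandbits(32) for _ in range(count)]


seq_32 = first_outputs(hash_32bit)
seq_64 = first_outputs(hash_64bit)

# The low 32 bits agree, yet the sequences are compared separately because
# every bit of an integer seed feeds into the initial state.
print("low 32 bits equal:", hash_32bit == (hash_64bit & 0xFFFFFFFF))
print("sequences equal:  ", seq_32 == seq_64)
```

Seeding with an explicit integer avoids the problem because the same integer is passed on every build, rather than a build-dependent hash.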
msg99109 - (view)	Author: Michael Foord (michael.foord) *	Date: 2010-02-09 11:31
If we aren't going to fix it, should we document the limitation?
msg99110 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2010-02-09 11:43
Well, ideally we should drop the automatic hash() and only accept:
1) ints/longs
2) buffer-like objects
(and tell people to hash() explicitly if they want to)

If that's too disruptive, we should document it. And, for 3.x, provide the following recipe to hash from a bytes object without losing entropy, and keeping the same results under 32-bit and 64-bit builds:

>>> import random
>>> random.seed(int.from_bytes(b'foo', 'little'))
>>> random.random()
0.08384169414918807
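The recipe above could be wrapped in a small helper, sketched here under the assumption that int.from_bytes is available (Python 3.2 or later). The function name seed_from_bytes is hypothetical and not part of the random module.

```python
import random


def seed_from_bytes(data, byteorder="little"):
    """Build a generator seeded from every byte of `data`, with no hash() step."""
    seed_value = int.from_bytes(data, byteorder)
    return random.Random(seed_value)


rng_a = seed_from_bytes(b"foo")
rng_b = seed_from_bytes(b"foo")

print(int.from_bytes(b"foo", "little"))  # 7303014, i.e. 0x6f6f66
print(rng_a.random() == rng_b.random())  # same seed, same first value
```

One caveat of this conversion: zero bytes at the most significant end do not change the integer, so with 'little' byte order b'foo' and b'foo\x00' map to the same seed.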
msg99111 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2010-02-09 12:02
[Antoine]
> >>> random.seed(int.from_bytes(b'foo', 'little'))

+1 for either documenting this useful trick, or modifying init_by_array to do this automatically for buffer-like objects. Disallowing other types of input for the seed sounds dangerous, though.
msg99123 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2010-02-09 16:40
I will update the documentation.
msg99157 - (view)	Author: Terrence Cole (terrence)	Date: 2010-02-10 07:53
Thank you for all the help! I'm glad to see that the use of hash() on buffer compatible objects is an actual gotcha and not just me being obtuse. Also, for those googling who, like me, won't be able to use 3.2's from_bytes until 3.2 is released in December, here is code to convert a bytes to an int:

>>> n = 0
>>> off = 0
>>> data = b'\xfc\x00'
>>> for c in data[::-1]:
...     n += c << off
...     off += 8
...
>>> print(n)
64512

Or, if you prefer the functional style:

>>> sum([c<<off for c,off in zip(data[::-1],range(0,len(data)*8+1,8))])
64512
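For comparison, a compact sketch of the same conversion. Because the loop above walks the data in reverse and treats the last byte as least significant, it is a big-endian conversion; it matches int.from_bytes(data, 'big') rather than the 'little' order used in the earlier recipe. The function name bytes_to_int is hypothetical.

```python
def bytes_to_int(data):
    """Interpret `data` as a big-endian unsigned integer (Python 3 bytes)."""
    n = 0
    for byte in data:
        # Shift the accumulated value up one byte and append the next byte.
        n = (n << 8) | byte
    return n


data = b"\xfc\x00"
print(bytes_to_int(data))                                 # 64512
print(bytes_to_int(data) == int.from_bytes(data, "big"))  # True
```

Either byte order works as a seed, as long as the same one is used everywhere the sequence needs to be reproduced.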
msg115740 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2010-09-07 04:51
Fixed in r84574 and r84576. The seed function no longer uses hash() for str, bytes, or bytearray arguments.
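A quick sanity check one could run on a Python 3 release that includes this change. It is only a sketch: it shows that a bytes seed gives a repeatable sequence within one interpreter, and it does not by itself demonstrate cross-architecture agreement, which needs the same script run on both builds. The helper name sample is hypothetical.

```python
import random


def sample(seed_value, count=8):
    """Return the first `count` 8-bit outputs for the given seed."""
    rng = random.Random(seed_value)
    return [rng.getrandbits(8) for _ in range(count)]


# Seeding directly from bytes: repeatable for equal seeds.
print(sample(b"foo") == sample(b"foo"))
# A seed that differs in one byte is compared for contrast.
print(sample(b"foo") == sample(b"fop"))
```

To compare architectures, print sample(b"foo") on each build and check that the lists match.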

History
Date	User	Action	Args
2022-04-11 14:56:57	admin	set	github: 52137
2010-09-07 04:51:36	rhettinger	set	status: open -> closed; resolution: fixed; messages: +
2010-02-10 07:53:18	terrence	set	messages: +
2010-02-09 16:40:09	rhettinger	set	nosy: loewis, rhettinger, mark.dickinson, pitrou, michael.foord, terrence; messages: +; components: + Documentation, - Library (Lib); versions: - Python 2.6, Python 2.7, Python 3.2
2010-02-09 12:02:32	mark.dickinson	set	messages: +
2010-02-09 11:43:49	pitrou	set	messages: +; versions: + Python 2.6, Python 2.7, Python 3.2
2010-02-09 11:31:54	michael.foord	set	nosy: + michael.foord; messages: +
2010-02-09 11:25:23	pitrou	set	messages: +
2010-02-09 11:18:46	pitrou	set	messages: +
2010-02-09 11:16:39	pitrou	set	nosy: + pitrou; messages: +
2010-02-09 10:57:38	mark.dickinson	set	nosy: + mark.dickinson
2010-02-09 06:52:12	rhettinger	set	assignee: rhettinger; messages: +; nosy: + rhettinger
2010-02-09 03:17:13	loewis	set	nosy: + loewis; messages: +
2010-02-09 00:15:51	terrence	create