Issue 18928: Remove misleading documentation for random.shuffle (original) (raw)

Created on 2013-09-05 00:18 by dbenbenn, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name	Uploaded	Description	Edit
mywork.patch	dbenbenn,2013-09-05 00:18	Remove incorrect documentation comment	review

Messages (5)
msg196971 - (view)	Author: David Benbennick (dbenbenn)	Date: 2013-09-05 00:18
Since Python 2.1 [1], when random.shuffle was added, the documentation has said: """Note that for even rather small len(x), the total number of permutations of x is larger than the period of most random number generators; this implies that most permutations of a long sequence can never be generated.""" This comment is incorrect and misleading. In fact, I claim that shuffle can produce "all" permutations for any representable sequence. To shuffle a sequence of length N requires log(N!) ~ N * log(N/e) bits of randomness [2]. The random module provides a generator with "a period of 219937-1", meaning you can get 219937 bits of randomness out of it before it starts repeating. All of which is to say that any representable sequence, say N < 250, will need no more than 260 bits of randomness to shuffle. That is well within the period of the random number generator. Attached is a patch that deletes the comment. An illustration of this misconception is at [3]. 1: http://docs.python.org/release/2.1/lib/module-random.html 2: https://en.wikipedia.org/wiki/Factorial#Rate_of_growth_and_approximations_for_large_n 3: http://stackoverflow.com/questions/3062741/maximal-length-of-list-to-shuffle-with-python-random-shuffle
msg196974 - (view)	Author: R. David Murray (r.david.murray) *	Date: 2013-09-05 03:11
It seems to me that 2080 (per the accepted answer to your [3]) is indeed "a rather small len(x)", and that the docs are correct as written. I wonder if it would be worth adding a footnote that explains how to calculate that example 2080 number from the documented period of the random number generator, to make the concepts involved a little bit clearer?
msg196975 - (view)	Author: R. David Murray (r.david.murray) *	Date: 2013-09-05 03:20
Alternatively, you would have to supply (or supply a pointer to) a mathematical proof of your thesis.
msg196978 - (view)	Author: Tim Peters (tim.peters) *	Date: 2013-09-05 04:20
When the comment was introduced, Python's Wichmann-Hill generator had a much shorter period, and we couldn't even generate all the permutations of a deck of cards. The period is astronomically larger now, but the stackoverflow answer (2080) is correct for the current upper bound. The docs could be beefed up to say more about that - but they'd get awfully tedious ;-) To the OP, this is your error: it takes lg(N!) "bits of randomness" (lg is the logarithm to base 2) to generate _one_ random permutation. That says nothing about what's needed to generate _all_ permutations. With an RNG of period P, there are P possible starting states. The starting state wholly determines the permutation produced. Therefore an RNG of period P cannot generate more than P distinct permutations (and that's an absolute upper bound - it may not be achieved). That's why comparing N! to P is the correct computation here, and indeed: >>> math.factorial(2080) > 219937 - 1 False >>> math.factorial(2081) > 219937 - 1 True If you still don't believe it, here's a challenge: take a toy RNG with a small period, and use it to generate permutations (via any method you like). You'll find that you never get more distinct permutations than the period of your RNG. Then reread the above ;-)
msg196979 - (view)	Author: David Benbennick (dbenbenn)	Date: 2013-09-05 05:14
Okay, I see what the comment is saying now. I was mistaken. It might make the statement clearer if it is made more precise. "rather small" is vague. That vagueness is intentional, but it makes it more confusing.

History
Date	User	Action	Args
2022-04-11 14:57:50	admin	set	github: 63128
2013-09-05 05:40:10	tim.peters	set	resolution: not a bugstage: resolved
2013-09-05 05:14:16	dbenbenn	set	status: open -> closedmessages: +
2013-09-05 04:20:34	tim.peters	set	nosy: + tim.petersmessages: +
2013-09-05 03:20:43	r.david.murray	set	messages: +
2013-09-05 03:11:00	r.david.murray	set	nosy: + rhettinger, r.david.murraymessages: +
2013-09-05 00🔞34	dbenbenn	create