Issue 7522: random.choice should accept a set as input (original) (raw)

Created on 2009-12-16 07:52 by lleeoo, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (9)
msg96480 - (view) Author: Leo (lleeoo) Date: 2009-12-16 07:52
The following code should just work: import random random.choice(set(range(5))) instead the output is TypeError: TypeError: 'set' object does not support indexing The algorithm in random.choice requires a sequence, but the semantics of choice do not, and should not, require a sequence. The code should be changed to convert the input to a sequence instead. Cheers, Leo.
msg96486 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-12-16 12:50
This would be a new feature, so would have to wait for Python 2.7 / 3.2 (2.6 is only receiving bugfixes).
msg96492 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2009-12-16 17:57
The underlying data structure for sets doesn't lend itself to an efficient method of random selection. It is best for the programmer to explictly convert to a sequence and then make the random selection (that way the conversion cost isn't hidden). If you don't mind the inefficiency of an implicit conversion to a list, you can use "random.sample(s, 1)[0]". If only an arbitrary element is needed, use "x=next(iter(s))" to non-destructively fetch the first element.
msg96540 - (view) Author: Leo (lleeoo) Date: 2009-12-17 22:59
Thanks for the suggestions "random.sample(s, 1)[0]" and "x=next(iter(s))". A counterpoint: isn't this a classic example of where polymorphism enables a more elegant, simple, and clear (dare I say Pythonic) approach? The complexity of allowing sets as inputs, even with a naive implementation, is still O(n) even in the worst case vs. O(1) for arrays. Isn't documenting that distinction good enough? Thanks, Leo.
msg105138 - (view) Author: Thomas Dybdahl Ahle (thomasahle) * Date: 2010-05-06 11:29
Why not just add support to the set container? As far as I know, it is a binary search tree, so supporting random picking in O(logn) should be easy.
msg105139 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-05-06 11:40
> As far as I know, it is a binary search tree, It's not: it's based on a hash table. It's essentially a dict with keys but no values. An additional complication is that the hash table can be very sparsely filled, in the case of a large set that has had most of its elements deleted---there's no automatic shrinkage of the hash table in that case. So repeated random selection until you find a filled hash table entry would be inefficient in that case.
msg105160 - (view) Author: Thomas Dybdahl Ahle (thomasahle) * Date: 2010-05-06 19:48
I'm sorry. I see the problem then. Do you know, if there are any plans of adding a fast balanced binary search tree to pythons stdlib?
msg159699 - (view) Author: Michele Mazzucchi (michelem) Date: 2012-04-30 14:32
Folks, I really think this should be addressed. Python has beautiful data structure semantics, and this is a stain in them. An implementation based on the current underlying hash table is quite simple, just pick random addresses until an active key is found. Even on sparse tables this is probabilistic O(1). Even with average load factor = 50%, only 1 extra attempt is needed; 2 with LF as low as 33%. I'm happy to provide a patch if anyone defines the desired API in Include/setobject.h .
msg159721 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2012-04-30 18:25
Michele, you might want to raise this on the python-dev or python-ideas mailing list---it'll get better exposure there than in this closed issue. For completeness, see also the previous discussion in issue 936988.
History
Date User Action Args
2022-04-11 14:56:55 admin set github: 51771
2012-04-30 18:25:27 mark.dickinson set messages: +
2012-04-30 14:32:37 michelem set nosy: + michelemmessages: +
2010-05-06 19:48:40 thomasahle set messages: +
2010-05-06 11:40:40 mark.dickinson set messages: +
2010-05-06 11:29:57 thomasahle set nosy: + thomasahlemessages: +
2009-12-17 22:59:50 lleeoo set messages: +
2009-12-16 17:57:25 rhettinger set status: open -> closedresolution: rejectedmessages: +
2009-12-16 12:50:56 mark.dickinson set versions: + Python 2.7, Python 3.2, - Python 2.6nosy: + rhettinger, mark.dickinsonmessages: + assignee: rhettingertype: behavior -> enhancement
2009-12-16 07:52:40 lleeoo create