msg275707 - (view) |
Author: Terry J. Reedy (terry.reedy) *  |
Date: 2016-09-10 22:22 |
Proposal from SO: in the iteration loop for 'myset.difference(iterable)', add equivalent of "if not myset: break'. https://stackoverflow.com/questions/39378043/why-does-pythons-set-difference-method-take-time-with-an-empty-set In the toy example for testing that this is currently not so, myset starts empty, but it was noted by the OP that a more realistic example would be that myset becomes empty after deletions. Postulated reasons why not to do this (as opposed to why it has not been done yet ;-): 1) averaged over all uses, the time saved not iterating would be less than the time spent testing emptyness. 2) an implicit guarantee to complete the iteration for possible side-effects. One answer notes that myset.difference(anotherset) is special-cased and faster than the equivalent non-set iterable. I searched the tracker for 'empty set difference' and got no hits. If I remember, I will post any disposition of this issue back to SO. |
|
|
msg275835 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2016-09-11 19:24 |
This is reasonable, cheap, and not hard to do. I'll whip-up a patch shortly. |
|
|
msg275841 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2016-09-11 19:53 |
New timings look nice: $ py -m timeit -s "r = range(10 ** 4); s = set()" "s.difference(r)" 10000000 loops, best of 3: 0.104 usec per loop $ py -m timeit -s "r = set(range(10 ** 4)); s = set()" "s.difference(r)" 10000000 loops, best of 3: 0.105 usec per loop $ py -m timeit -s "r = range(10 ** 4); s = set()" "s.difference_update(r)" 10000000 loops, best of 3: 0.0659 usec per loop $ py -m timeit -s "r = set(range(10 ** 4)); s = set()" "s.difference_update(r)" 10000000 loops, best of 3: 0.0684 usec per loop |
|
|
msg275962 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2016-09-12 05:02 |
New changeset fa0af1a6344d by Raymond Hettinger in branch 'default': Issue #28071: Add early-out for differencing from an empty set. https://hg.python.org/cpython/rev/fa0af1a6344d |
|
|
msg275965 - (view) |
Author: Terry J. Reedy (terry.reedy) *  |
Date: 2016-09-12 05:15 |
The patch covers sets that start empty, but not sets that become empty during iteration. Are you thinking about or rejecting the latter? (And should this then be closed?) |
|
|
msg275967 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2016-09-12 05:43 |
I only want a single quick check upfront. It doesn't make much sense to put this in the inner loop. |
|
|
msg306973 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2017-11-25 20:35 |
This optimization caused a behavior change: Python 3.5: >>> set().difference([[]]) Traceback (most recent call last): File "", line 1, in TypeError: unhashable type: 'list' Python 3.6: >>> set().difference([[]]) set() |
|
|
msg345185 - (view) |
Author: Karthikeyan Singaravelan (xtreak) *  |
Date: 2019-06-11 05:50 |
Behavior change in is tracked with |
|
|