Issue 28071: Stop set.difference when set is empty (original) (raw)

Created on 2016-09-10 22:22 by terry.reedy, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
set_diff_early_out.diff rhettinger,2016-09-11 19:53 review
Messages (8)
msg275707 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2016-09-10 22:22
Proposal from SO: in the iteration loop for 'myset.difference(iterable)', add equivalent of "if not myset: break'. https://stackoverflow.com/questions/39378043/why-does-pythons-set-difference-method-take-time-with-an-empty-set In the toy example for testing that this is currently not so, myset starts empty, but it was noted by the OP that a more realistic example would be that myset becomes empty after deletions. Postulated reasons why not to do this (as opposed to why it has not been done yet ;-): 1) averaged over all uses, the time saved not iterating would be less than the time spent testing emptyness. 2) an implicit guarantee to complete the iteration for possible side-effects. One answer notes that myset.difference(anotherset) is special-cased and faster than the equivalent non-set iterable. I searched the tracker for 'empty set difference' and got no hits. If I remember, I will post any disposition of this issue back to SO.
msg275835 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2016-09-11 19:24
This is reasonable, cheap, and not hard to do. I'll whip-up a patch shortly.
msg275841 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2016-09-11 19:53
New timings look nice: $ py -m timeit -s "r = range(10 ** 4); s = set()" "s.difference(r)" 10000000 loops, best of 3: 0.104 usec per loop $ py -m timeit -s "r = set(range(10 ** 4)); s = set()" "s.difference(r)" 10000000 loops, best of 3: 0.105 usec per loop $ py -m timeit -s "r = range(10 ** 4); s = set()" "s.difference_update(r)" 10000000 loops, best of 3: 0.0659 usec per loop $ py -m timeit -s "r = set(range(10 ** 4)); s = set()" "s.difference_update(r)" 10000000 loops, best of 3: 0.0684 usec per loop
msg275962 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016-09-12 05:02
New changeset fa0af1a6344d by Raymond Hettinger in branch 'default': Issue #28071: Add early-out for differencing from an empty set. https://hg.python.org/cpython/rev/fa0af1a6344d
msg275965 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2016-09-12 05:15
The patch covers sets that start empty, but not sets that become empty during iteration. Are you thinking about or rejecting the latter? (And should this then be closed?)
msg275967 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2016-09-12 05:43
I only want a single quick check upfront. It doesn't make much sense to put this in the inner loop.
msg306973 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-11-25 20:35
This optimization caused a behavior change: Python 3.5: >>> set().difference([[]]) Traceback (most recent call last): File "", line 1, in TypeError: unhashable type: 'list' Python 3.6: >>> set().difference([[]]) set()
msg345185 - (view) Author: Karthikeyan Singaravelan (xtreak) * (Python committer) Date: 2019-06-11 05:50
Behavior change in is tracked with
History
Date User Action Args
2022-04-11 14:58:36 admin set github: 72258
2019-06-11 05:50:18 xtreak set nosy: + xtreakmessages: +
2017-11-25 20:35:11 serhiy.storchaka set messages: +
2016-09-12 05:43:41 rhettinger set status: open -> closedresolution: fixedmessages: +
2016-09-12 05:31:37 serhiy.storchaka set nosy: + serhiy.storchaka
2016-09-12 05:15:15 terry.reedy set status: closed -> openresolution: fixed -> (no value)messages: +
2016-09-12 05:04:55 rhettinger set status: open -> closedresolution: fixed
2016-09-12 05:02:38 python-dev set nosy: + python-devmessages: +
2016-09-11 19:53:14 rhettinger set files: + set_diff_early_out.diffkeywords: + patchmessages: + stage: needs patch -> patch review
2016-09-11 19:24:08 rhettinger set messages: +
2016-09-11 19:19:53 rhettinger set assignee: rhettinger
2016-09-10 22:22:04 terry.reedy create