Issue 27144: concurrent.futures.as_completed() memory inefficiency (original) (raw)

Created on 2016-05-28 13:25 by grzgrzgrz3, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
reproduce.py grzgrzgrz3,2016-05-28 13:25
issue27144.patch grzgrzgrz3,2016-05-28 13:26 review
Pull Requests
URL Status Linked Edit
PR 1560 merged grzgrzgrz3,2017-05-12 18:42
PR 3266 merged pitrou,2017-09-01 17:01
PR 3270 merged pitrou,2017-09-03 10:26
PR 3271 merged pitrou,2017-09-03 13:11
Messages (15)
msg266552 - (view) Author: Grzegorz Grzywacz (grzgrzgrz3) * Date: 2016-05-28 13:25
as_complite generator keeps reference of all passed futures until StopIteration. It may lead to serious memory inefficiency. Solution is to remove reference from lists and yield future ad-hoc. I have submitted patch and reproduce sample. I can create backport for older versions if needed.
msg288708 - (view) Author: Will Vousden (willvousden) Date: 2017-02-28 12:15
Is there a reason this hasn't been merged yet? Fixing this bug locally (albeit just by setting Future._result = None when I've extracted the result) reduced the memory footprint of my program from 50 GB to 7 GB, so it seems worth it.
msg293435 - (view) Author: Mark DePristo (Mark DePristo) Date: 2017-05-10 16:42
Any update on this reviewing, patching, and then backporting to 2.7? We've just run into the exact same issue here in Google using a ThreadPoolExecutor.map call growing memory usage without bounds due to the Future holding onto its result even after being returned from the map call. There's a more specific patch possible for just map() itself, but this more general one seems like it'd be better.
msg293439 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2017-05-10 17:01
> Any update on this reviewing, patching, and then backporting to 2.7? concurrent package was added in 3.2. How backport to 2.7? Nosy myself.
msg293563 - (view) Author: Grzegorz Grzywacz (grzgrzgrz3) * Date: 2017-05-12 18:46
> We just ran into the exact same issue here in Google using a ThreadPoolExecutor.map call Looks like map got simillar issue. I created GitHub PR with map fixed. > Concurrent package was added in 3.2. How backport it 2.7? There is official, unofficial 2.7 backport. Git: https://github.com/agronholm/pythonfutures Pip: futures Ubuntu package: python-concurrent.futures
msg301136 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2017-09-01 16:54
New changeset 97e1b1c81458d2109b2ffed32ffa1eb643a6c3b9 by Antoine Pitrou (Grzegorz Grzywacz) in branch 'master': bpo-27144: concurrent.futures as_complete and map iterators do not keep reference to returned object (#1560) https://github.com/python/cpython/commit/97e1b1c81458d2109b2ffed32ffa1eb643a6c3b9
msg301141 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2017-09-01 17:16
New changeset ea767915f7476c1fe97f7b1a53304d57f105bdd2 by Antoine Pitrou in branch '3.6': [3.6] bpo-27144: concurrent.futures as_complete and map iterators do not keep reference to returned object (GH-1560) (#3266) https://github.com/python/cpython/commit/ea767915f7476c1fe97f7b1a53304d57f105bdd2
msg301142 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2017-09-01 17:20
I've merged the PR to master and backported it to 3.6. Thank you Grzegorz for contributing this!
msg301181 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-09-03 04:18
"concurrent.futures.as_completed() memory inefficiency" hum, sadly the commit 97e1b1c81458d2109b2ffed32ffa1eb643a6c3b9 introduced a reference leak. Example: $ ./python -m test -R 3:3 -test_concurrent_futures -m test.test_concurrent_futures.ProcessPoolAsCompletedTests.test_zero_timeout (...) test_concurrent_futures leaked [27, 27, 27] references, sum=81 test_concurrent_futures leaked [16, 17, 16] memory blocks, sum=49 (...)
msg301186 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2017-09-03 10:26
Thank you Victor. There is indeed a logic error in the new code. I'm opening a new PR.
msg301187 - (view) Author: Grzegorz Grzywacz (grzgrzgrz3) * Date: 2017-09-03 10:53
Tests are reusing finished futures. `_yield_and_decref` function do not clear waiters in finished futures. In the initial merge i propose to clear waiters but after review we decide it should be removed. I am confused now, should we change tests or restore initial `_yield_and_decref` function.
msg301189 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2017-09-03 13:09
New changeset 2ef37607b7aacb7c750d008b9113fe11f96163c0 by Antoine Pitrou in branch 'master': Fix a c.f.as_completed() refleak previously introduced in bpo-27144 (#3270) https://github.com/python/cpython/commit/2ef37607b7aacb7c750d008b9113fe11f96163c0
msg301190 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2017-09-03 13:30
New changeset 5cbca0235b8da07c9454bcaa94f12d59c2df0ad2 by Antoine Pitrou in branch '3.6': [3.6] Fix a c.f.as_completed() refleak previously introduced in bpo-27144 (GH-3270) (#3271) https://github.com/python/cpython/commit/5cbca0235b8da07c9454bcaa94f12d59c2df0ad2
msg301191 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2017-09-03 13:31
The regression should be fixed now.
msg301197 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-09-03 20:24
Thank you both for this nice enhancement.
History
Date User Action Args
2022-04-11 14:58:31 admin set github: 71331
2017-09-03 20:24:00 vstinner set messages: +
2017-09-03 13:31:34 pitrou set status: open -> closedresolution: fixedmessages: +
2017-09-03 13:30:58 pitrou set messages: +
2017-09-03 13:11:09 pitrou set pull_requests: + <pull%5Frequest3315>
2017-09-03 13:09:27 pitrou set messages: +
2017-09-03 10:53:22 grzgrzgrz3 set messages: +
2017-09-03 10:26:31 pitrou set pull_requests: + <pull%5Frequest3314>
2017-09-03 10:26:09 pitrou set messages: +
2017-09-03 04🔞31 vstinner set status: closed -> opennosy: + vstinnermessages: + resolution: fixed -> (no value)
2017-09-01 17:20:37 pitrou set status: open -> closedversions: - Python 3.5messages: + resolution: fixedstage: patch review -> resolved
2017-09-01 17:16:49 pitrou set messages: +
2017-09-01 17:01:58 pitrou set pull_requests: + <pull%5Frequest3310>
2017-09-01 16:54:02 pitrou set nosy: + pitroumessages: +
2017-05-12 18:46:28 grzgrzgrz3 set messages: +
2017-05-12 18:42:55 grzgrzgrz3 set pull_requests: + <pull%5Frequest1658>
2017-05-10 17:01:57 xiang.zhang set nosy: + xiang.zhangmessages: + versions: + Python 3.7
2017-05-10 16:42:23 Mark DePristo set nosy: + Mark DePristomessages: +
2017-02-28 12:15:52 willvousden set messages: +
2017-02-28 12:03:23 willvousden set nosy: + willvousden
2016-07-21 18:13:13 rnester set nosy: + rnester
2016-05-28 14:04:30 SilentGhost set stage: patch reviewversions: - Python 3.2, Python 3.3, Python 3.4
2016-05-28 13:26:47 grzgrzgrz3 set files: + issue27144.patchkeywords: + patch
2016-05-28 13:25:35 grzgrzgrz3 create