Issue 13159: _io.FileIO uses a quadratic-time buffer growth algorithm (original ) (raw ) This issue has been migrated to GitHub: https://github.com/python/cpython/issues/57368
classification
process
Created on 2011-10-12 16:02 by nadeem.vawda , last changed 2022-04-11 14:57 by admin . This issue is now closed .
Messages (7)
msg145403 - (view)
Author: Nadeem Vawda (nadeem.vawda) *
Date: 2011-10-12 16:01
As mentioned in issue 6715 , the buffer growth strategy used by _io.FileIO has quadratic running time if each reallocation is O(n). The code in question is new_buffersize() from Modules/_io/fileio.c : if (currentsize > SMALLCHUNK) { /* Keep doubling until we reach BIGCHUNK; then keep adding BIGCHUNK. */ if (currentsize <= BIGCHUNK) return currentsize + currentsize; else return currentsize + BIGCHUNK; } return currentsize + SMALLCHUNK; Is there a reason for this? One possible improvement would be to instead use the same formula as list.resize() in Objects/listobject.c : new_allocated = (newsize >> 3) + (newsize < 9 ? 3 : 6); which has amortized O(n) running time behaviour. Your thoughts?
msg145418 - (view)
Author: Nadeem Vawda (nadeem.vawda) *
Date: 2011-10-12 17:19
I've attached a patch that makes the suggested change to FileIO, and also to _bz2.BZ2Compressor/Decompressor (which currently have the same issue).
msg145443 - (view)
Author: Antoine Pitrou (pitrou) *
Date: 2011-10-12 23:38
The patch looks good to me, thanks.
msg145448 - (view)
Author: Roundup Robot (python-dev)
Date: 2011-10-13 11:45
New changeset d18c80a8c119 by Nadeem Vawda in branch '3.2': Issue #13159 : Replace FileIO's quadratic-time buffer growth algorithm with a linear-time one. http://hg.python.org/cpython/rev/d18c80a8c119 New changeset 4a6709a071d0 by Nadeem Vawda in branch 'default': Merge #13159 : Replace FileIO's quadratic-time buffer growth algorithm with a linear-time one. http://hg.python.org/cpython/rev/4a6709a071d0
msg145449 - (view)
Author: Roundup Robot (python-dev)
Date: 2011-10-13 11:58
New changeset c1c434e30e06 by Nadeem Vawda in branch '2.7': Issue #13159 : Replace FileIO's quadratic-time buffer growth algorithm with a linear-time one. http://hg.python.org/cpython/rev/c1c434e30e06
msg145450 - (view)
Author: Antoine Pitrou (pitrou) *
Date: 2011-10-13 12:04
Thank you :)
msg145453 - (view)
Author: Nadeem Vawda (nadeem.vawda) *
Date: 2011-10-13 14:01
No problem :)
History
Date
User
Action
Args
2022-04-11 14:57:22
admin
set
github: 57368
2011-10-13 14:01:59
nadeem.vawda
set
status: open -> closedmessages: + assignee: nadeem.vawda resolution: fixedstage: patch review -> resolved
2011-10-13 12:04:31
pitrou
set
messages: +
2011-10-13 11:58:31
python-dev
set
messages: +
2011-10-13 11:45:44
python-dev
set
nosy: + python-dev messages: +
2011-10-12 23:38:22
pitrou
set
messages: + versions: + Python 2.7, Python 3.2
2011-10-12 17:24:31
vstinner
set
nosy: + vstinner
2011-10-12 17:20:01
nadeem.vawda
set
stage: needs patch -> patch review
2011-10-12 17:19:15
nadeem.vawda
set
files: + buffer-growth.diff keywords: + patch messages: +
2011-10-12 16:02:00
nadeem.vawda
create