Issue 2601: [regression] reading from a urllib2 file descriptor happens byte-at-a-time

Created on 2008-04-08 21:15 by doko, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (10)
msg65219 - (view) Author: Matthias Klose (doko) * (Python committer) Date: 2008-04-08 21:15
r61009 on the 2.5 branch

  - Bug #1389051, 1092502: fix excessively large memory allocations when calling .read() on a socket object wrapped with makefile().

causes a regression compared to 2.4.5 and 2.5.2:

When reading from a urllib2 file descriptor, Python will read the data a byte at a time regardless of how much you ask for. Python versions up to 2.5.2 will read the data in 8K chunks.

This has enough of a performance impact that it increases download time for a large file over a gigabit LAN from 10 seconds to 34 minutes. (!)

Trivial/obvious example code:

    import urllib2

    f = urllib2.urlopen("http://launchpadlibrarian.net/13214672/nexuiz-data_2.4.orig.tar.gz")
    while 1:
        chunk = f.read()

... and then strace it to see the recv()'s chugging along, one byte at a time.
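To make the cost in the report above concrete, here is a minimal Python 3 sketch that counts recv() calls when the same payload is read one byte at a time versus in 8K requests. It is not the Python 2 socket._fileobject code and does not reproduce the regression itself; CountingSocket, read_exactly and count_recv_calls are hypothetical names introduced only for this illustration, and a local socket pair stands in for the HTTP connection.

```python
import socket
import threading


class CountingSocket:
    """Hypothetical wrapper (not part of the stdlib) that counts recv() calls."""

    def __init__(self, sock):
        self._sock = sock
        self.recv_calls = 0

    def recv(self, bufsize):
        self.recv_calls += 1
        return self._sock.recv(bufsize)


def read_exactly(sock, total, recv_size):
    """Read `total` bytes, asking recv() for at most `recv_size` bytes per call."""
    chunks = []
    remaining = total
    while remaining > 0:
        data = sock.recv(min(recv_size, remaining))
        if not data:  # peer closed the connection early
            break
        chunks.append(data)
        remaining -= len(data)
    return b"".join(chunks)


def count_recv_calls(payload, recv_size):
    """Send `payload` over a local socket pair and count the recv() calls needed."""
    writer, reader = socket.socketpair()

    def send():
        # Runs in a thread so a payload larger than the socket buffer cannot deadlock.
        writer.sendall(payload)
        writer.close()

    sender = threading.Thread(target=send)
    sender.start()
    try:
        counting = CountingSocket(reader)
        data = read_exactly(counting, len(payload), recv_size)
    finally:
        reader.close()
        sender.join()
    assert data == payload
    return counting.recv_calls


if __name__ == "__main__":
    payload = b"x" * 65536
    print("1-byte requests:", count_recv_calls(payload, 1))
    print("8K requests:    ", count_recv_calls(payload, 8192))
```

With one-byte requests the number of recv() calls equals the payload size, while 8K requests need only a handful; each call is a system call, which is what the strace output in the report shows.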
msg65488 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-04-14 22:11
See #2632 for more discussion of what is probably the same issue.
msg65503 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2008-04-15 06:11
Bumping the priority. I'd like to see this fixed before the next release. What version(s) does this problem apply to: 2.5, 2.6, 3.0?
msg65504 - (view) Author: Ralf Schmitt (schmir) Date: 2008-04-15 06:21
quoting http://bugs.python.org/issue1389051: "Applied to 2.6 trunk in rev. 61008 and to 2.5-maint in rev. 61009." I don't know about py3k...
msg65517 - (view) Author: A.M. Kuchling (akuchling) * (Python committer) Date: 2008-04-15 13:15
It was applied to 2.5-maint after 2.5.2 was released, BTW, so the change isn't in any stable released version, only the 2.6 alphas.
msg65538 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2008-04-16 01:23
So if the fix was applied to 2.5 branch and 2.6 (3.0 should have picked up from 2.6 automatically), can we close this bug?
msg65539 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2008-04-16 02:18
I don't think the fix was acceptable. Now python spins consuming all cpu trying to read trivial amounts of data one byte at a time... See the discussion at the end of http://bugs.python.org/issue1092502 as well as a recent python-dev thread: http://mail.python.org/pipermail/python-dev/2008-April/078613.html
msg65540 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2008-04-16 02:21
Or else I'm missing something here in the maze of three bugs talking about the same issue. Which revisions fixed the introduced performance issue?
msg65545 - (view) Author: Ralf Schmitt (schmir) Date: 2008-04-16 06:02
Me and amk are talking about the commit that introduced this bug (which was meant as a fix for another bug). Neal seems to think that this commit is the fix to this bug itself. And Gregory, you are now confused :) Hope it's clear now.
msg65990 - (view) Author: Mark Hammond (mhammond) * (Python committer) Date: 2008-04-30 05:55
For those trying to follow along at home: best I can tell we have 3 other issues on this: #1092502 and #1389051 are dupes of an initial bug, but the fix for those bugs caused regressions reported in this bug and in #2632. To try and reduce confusion I'm closing this as a dupe of #2632 which has a patch for review.
History
Date                 User             Action  Args
2022-04-11 14:56:33  admin            set     nosy: + barry; github: 46853
2008-04-30 05:59:01  mhammond         set     resolution: duplicate
2008-04-30 05:56:14  mhammond         set     status: open -> closed
2008-04-30 05:56:00  mhammond         set     nosy: + mhammond; superseder: performance problem in socket._fileobject.read; messages: +
2008-04-16 06:02:02  schmir           set     messages: +
2008-04-16 02:21:14  gregory.p.smith  set     messages: +
2008-04-16 02:18:30  gregory.p.smith  set     nosy: + gregory.p.smith; messages: +
2008-04-16 01:23:34  nnorwitz         set     messages: +
2008-04-15 13:15:04  akuchling        set     messages: +
2008-04-15 06:21:49  schmir           set     messages: +; versions: + Python 2.6
2008-04-15 06:11:49  nnorwitz         set     priority: critical -> release blocker; nosy: + nnorwitz; messages: +
2008-04-14 22:47:31  schmir           set     nosy: + schmir
2008-04-14 22:11:04  pitrou           set     nosy: + pitrou; messages: +
2008-04-09 19:21:30  georg.brandl     set     priority: high -> critical
2008-04-08 21:15:29  doko             create