[Python-Dev] very bad network performance

Bill Janssen janssen at parc.com
Mon Apr 14 20:36:33 CEST 2008


There's some really convoluted code in socket._fileobject.__init__() here. When initializing a _fileobject, if the 'bufsize' parameter is explicitly given as zero, that's turned into an _rbufsize of 1, which, combined with the 'min' change, will produce the read-one-byte behavior. The code for setting _rbufsize seems odd; it would be nice if it were commented with some notes on why these specific selections were made.

    if bufsize < 0:
        bufsize = self.default_bufsize
    if bufsize == 0:
        self._rbufsize = 1
    elif bufsize == 1:
        self._rbufsize = self.default_bufsize
    else:
        self._rbufsize = bufsize
    self._wbufsize = bufsize

It also depends on whether 'read' is called with an explicit # of bytes to read (which appears to be the case here).
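To make the interaction concrete, here is a minimal simulation of the effect, not the actual socket.py code. FakeSock, read_exact and rbufsize_for are hypothetical names made up for this illustration, the read loop is heavily simplified, and the default buffer size of 8192 is an assumption. It counts how many recv() calls a read of 1000 bytes needs when _rbufsize is 1, first with 'max' and then with 'min'.

```python
# Illustrative simulation only. FakeSock, read_exact and rbufsize_for are
# hypothetical names for this sketch; they are not part of socket.py.

def rbufsize_for(bufsize, default_bufsize=8192):
    """Mirror the _rbufsize selection quoted above (default size assumed)."""
    if bufsize < 0:
        bufsize = default_bufsize
    if bufsize == 0:
        return 1
    if bufsize == 1:
        return default_bufsize
    return bufsize


class FakeSock:
    """Stand-in for a socket that serves a fixed payload and counts recv() calls."""

    def __init__(self, payload):
        self._payload = payload
        self._pos = 0
        self.recv_calls = 0

    def recv(self, n):
        self.recv_calls += 1
        chunk = self._payload[self._pos:self._pos + n]
        self._pos += len(chunk)
        return chunk


def read_exact(sock, size, rbufsize, pick):
    """Simplified read(size) loop; pick is max (before the change) or min (after)."""
    chunks = []
    buflen = 0
    while buflen < size:
        left = size - buflen
        recvsize = pick(rbufsize, left)
        data = sock.recv(recvsize)
        if not data:
            break
        chunks.append(data)
        buflen += len(data)
    return b"".join(chunks)


payload = b"x" * 1000
rbufsize = rbufsize_for(0)  # bufsize=0 is turned into an _rbufsize of 1

old = FakeSock(payload)
read_exact(old, 1000, rbufsize, max)

new = FakeSock(payload)
read_exact(new, 1000, rbufsize, min)

print(old.recv_calls, new.recv_calls)  # 1 1000
```

In this sketch the 'max' version asks for all 1000 bytes in a single recv(), while the 'min' version never asks for more than _rbufsize, so it issues one recv() per byte.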

So, it's not the code in socket.py, necessarily; it's the code which opens the socket, most likely. The only library which seems to use a bufsize of zero is httplib (which has a lot of other problems as well). I think the change cited below (while IMO correct) will affect a number of other HTTP-based services, too.

Bill

Ralf,

Terry is right. Please file a bug. I do think there may be a problem
with that change but I don't have the time to review it in depth.
Hopefully others will. I do recall that sockets reading one byte at a
time has been a problem before -- I recall a bug about this in the
1.5.2 era for Windows... Too bad it's back. :-(

--Guido

On Mon, Apr 14, 2008 at 10:25 AM, Terry Reedy <tjreedy at udel.edu> wrote:
>
> "Ralf Schmitt" <schmir at gmail.com> wrote in message
> news:932f8baf0804140912u54adc7d5md7261541857f21bd at mail.gmail.com...
>
> | Hi all,
> |
> | I'm using mercurial with the release25-maint branch. I noticed that checking
> | out a local repository now takes more than
> | 5 minutes (it should be around 30s).
> |
> | I've tracked it down to this change:
> | http://hgpy.de/py/release25-maint/rev/e9446c6ab3cd
> | this is svn revision 61009. Here is the diff inline:
> |
> | --- a/Lib/socket.py Fri Mar 23 14:27:29 2007 +0100
> | +++ b/Lib/socket.py Sat Feb 23 20:30:59 2008 +0100
> | @@ -305,7 +305,7 @@
> |              self._rbuf = ""
> |              while True:
> |                  left = size - buflen
> | -                recvsize = max(self._rbufsize, left)
> | +                recvsize = min(self._rbufsize, left)
> |                  data = self._sock.recv(recvsize)
> |                  if not data:
> |                      break
> |
> |
> | self._rbufsize is 1, and so the code reads one byte at a time. this is
> | clearly wrong, I'm posting it to the mailing list, as I don't want
> | this issue to get lost in the bugtracker.
>
> It is at least as likely to get lost here. There is a mailing list for new
> tracker items that many devs subscribe to.


