Issue 511073: urllib problems - Python tracker

Created on 2002-01-31 07:25 by ybenita, last changed 2022-04-10 16:04 by admin. This issue is now closed.

Messages (3)
msg9069 - Author: Yair Benita (ybenita) Date: 2002-01-31 07:25
When using urllib.urlopen("url") and then reading the file with handle.read(), I get only parts of pages. It works for short web pages, but when I use it to download large pages the result always comes back too short. To me it looks as if it tries to read the file before it has finished downloading.

Jack Jansen said: "MacPython may do short reads on sockets. I've always maintained that this was correct (which reasoning was quietly accepted by everyone here), but last year I finally admitted that it may actually be incorrect (which was again quietly accepted :-)"

Example:

x=urllib.urlopen("http://www.ebi.ac.uk/cgi-bin/emblfetch?db=embl&format=fasta&style=raw&id=AB002378")
print x.read()

Compare the file downloaded by any web browser with the file from MacPython.
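
A possible workaround for the symptom described above, sketched under assumptions: rather than trusting a single read() call to return the whole page, keep reading until read() returns an empty string, which signals end of file. The sketch is written for current Python 3 rather than the Python 2 of this report, and ShortReadStream and read_all are hypothetical names invented for illustration; the stream only simulates a handle that does short reads, so no network access is involved.

```python
import io


class ShortReadStream:
    """Hypothetical stand-in for a handle that may return fewer bytes than asked for."""

    def __init__(self, data, max_chunk):
        self._buf = io.BytesIO(data)
        self._max_chunk = max_chunk

    def read(self, size=-1):
        # Never hand back more than max_chunk bytes per call, even for a bare read().
        if size is None or size < 0 or size > self._max_chunk:
            size = self._max_chunk
        return self._buf.read(size)


def read_all(handle, chunk_size=8192):
    """Read from handle until it reports end of file (an empty result)."""
    chunks = []
    while True:
        chunk = handle.read(chunk_size)
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks)


payload = b"x" * 100000
stream = ShortReadStream(payload, max_chunk=1500)
first = stream.read()    # a single call comes back short
rest = read_all(stream)  # looping picks up the remainder
```

Here the single read() returns only the first 1500 bytes, while the loop recovers everything that follows. Whether the same loop would have helped on the affected MacPython build is not confirmed anywhere in this thread.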
msg9070 - Author: Jack Jansen (jackjansen) * (Python committer) Date: 2002-02-06 00:34
I have probably found the cause of this; now the only remaining task is finding out who to blame :-) httplib explicitly sets non-buffering I/O on the file corresponding to the socket, by calling self.fp = socket.makefile("rb", 0). MSL, the CodeWarrior I/O library, has an optimization (or bug :-) whereby an fread() from a binary file with buffering turned off calls the underlying read() straight away. Python's fileobject.c file_read() reacts to a short fread() return value by returning. One of these three is wrong, apparently.
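
The interaction described in this message can be modelled roughly as follows. This is a simplified Python 3 simulation, not the actual httplib, MSL, or fileobject.c code: FakeSocket, fread_unbuffered, fread_looping, and file_read are hypothetical names, and the assumption is that each underlying read() delivers at most one network packet.

```python
from collections import deque


class FakeSocket:
    """Hypothetical socket: each read() delivers at most one queued packet."""

    def __init__(self, packets):
        self._packets = deque(packets)

    def read(self, n):
        if not self._packets:
            return b""  # end of stream
        packet = self._packets.popleft()
        if len(packet) > n:
            # Keep the unread tail of the packet for the next call.
            self._packets.appendleft(packet[n:])
            packet = packet[:n]
        return packet


def fread_unbuffered(sock, n):
    # Models the behaviour attributed to MSL: one underlying read(), no retry.
    return sock.read(n)


def fread_looping(sock, n):
    # Models the alternative: keep reading until n bytes arrive or the stream ends.
    parts = []
    remaining = n
    while remaining > 0:
        part = sock.read(remaining)
        if not part:
            break
        parts.append(part)
        remaining -= len(part)
    return b"".join(parts)


def file_read(sock, n, fread):
    # Models file_read() as described: stop as soon as fread() comes back short.
    chunks = []
    got = 0
    while got < n:
        want = n - got
        chunk = fread(sock, want)
        chunks.append(chunk)
        got += len(chunk)
        if len(chunk) < want:
            break
    return b"".join(chunks)


packets = [b"a" * 300, b"b" * 300, b"c" * 100]
short = file_read(FakeSocket(packets), 1000, fread_unbuffered)
full = file_read(FakeSocket(packets), 1000, fread_looping)
```

In this model the unbuffered path stops after the first 300-byte packet, while the looping path returns all 700 bytes before the stream ends. That matches the truncated pages reported above: a short fread() is treated as end of data even though more packets are still on the way.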
msg9071 - Author: Jack Jansen (jackjansen) * (Python committer) Date: 2002-04-22 13:24
This was fixed some time ago (the fix made it into 2.2.1) by modifying the underlying GUSI I/O library. Apparently I forgot to close the bug report, so I'm doing so now.

History
Date                 User     Action  Args
2022-04-10 16:04:56  admin    set     github: 36005
2002-01-31 07:25:04  ybenita  create