Issue 511073: urllib problems - Python tracker
Created on 2002-01-31 07:25 by ybenita, last changed 2022-04-10 16:04 by admin. This issue is now closed.
Messages (3) | ||
---|---|---|
msg9069 - (view) | Author: Yair Benita (ybenita) | Date: 2002-01-31 07:25 |
When I use urllib.urlopen("url") and then read the file with handle.read(), I get only part of the page. It works for short web pages, but large pages always come back truncated. It looks to me as if it tries to read the file before it has finished downloading. Jack Jansen said: "MacPython may do short reads on sockets. I've always maintained that this was correct (which reasoning was quietly accepted by everyone here), but last year I finally admitted that it may actually be incorrect (which was again quietly accepted:-)" Example: x=urllib.urlopen("http://www.ebi.ac.uk/cgi-bin/emblfetch?db=embl&format=fasta&style=raw&id=AB002378"); print x.read() Compare the file downloaded by any web browser with the file from MacPython. | ||
msg9070 - (view) | Author: Jack Jansen (jackjansen) | Date: 2002-02-06 00:34 |
I have probably found the cause of this; the only remaining task is finding out who to blame :-) httplib explicitly sets non-buffering I/O on the file corresponding to the socket by calling self.fp = socket.makefile("rb", 0). MSL, the CodeWarrior I/O library, has an optimization (or bug :-) whereby an fread() from a binary file with buffering turned off calls the underlying read() straight away. Python's file_read() in fileobject.c reacts to a short fread() return value by returning. One of these three is wrong, apparently. | ||
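The interaction described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the MSL, GUSI, or fileobject.c code: `ShortReadStream`, `read_once`, and `read_fully` are hypothetical names, and the 64-byte chunk size is arbitrary. It shows why a reader that returns after the first short read loses data, while a reader that loops until end-of-file does not.

```python
class ShortReadStream:
    """Hypothetical stand-in for an unbuffered socket file.

    Each read() hands back at most `chunk` bytes, no matter how many
    were requested, which is what a "short read" means here.
    """

    def __init__(self, data, chunk):
        self._data = data
        self._pos = 0
        self._chunk = chunk

    def read(self, n):
        n = min(n, self._chunk)
        piece = self._data[self._pos:self._pos + n]
        self._pos += len(piece)
        return piece  # b"" once the data is exhausted (end-of-file)


def read_once(stream, n):
    """Trust a single read call, as the report suggests file_read() did."""
    return stream.read(n)


def read_fully(stream, n):
    """Keep reading until n bytes have arrived or the stream reports EOF."""
    parts = []
    remaining = n
    while remaining > 0:
        piece = stream.read(remaining)
        if not piece:  # empty result means end-of-file, not a short read
            break
        parts.append(piece)
        remaining -= len(piece)
    return b"".join(parts)


payload = b"x" * 1000
truncated = read_once(ShortReadStream(payload, 64), 1000)
complete = read_fully(ShortReadStream(payload, 64), 1000)
print(len(truncated), len(complete))
```

Any one of the three layers could in principle absorb the short read; the loop here simply places that responsibility in the caller.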
msg9071 - (view) | Author: Jack Jansen (jackjansen) | Date: 2002-04-22 13:24 |
This was fixed some time ago (the fix made it into 2.2.1) by modifying the underlying GUSI I/O library. Apparently I forgot to close the bug report, so I'm doing so now. |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-10 16:04:56 | admin | set | github: 36005 |
2002-01-31 07:25:04 | ybenita | create |