msg24618 - (view) |
Author: K Lars Lohn (lohnk) |
Date: 2005-03-15 00:39 |
Python 2.4 and Python 2.3.4 running under Suse 9.2

We're getting an AttributeError exception, "AttributeError: 'NoneType' object has no attribute 'read'", within a very simple call to urllib.urlopen.

This was discovered while working on Sentry 2, the new mirror integrity checker for the Mozilla project. We try to touch hundreds of URLs to make sure that the files are present on each of the mirrors. One particular URL kills the call to urllib.urlopen:

http://mozilla.mirrors.skynet.be/pub/ftp.mozilla.org/firefox/releases/1.0/win32/en-US/Firefox%20Setup%201.0.exe

This file probably does not exist on the mirror; however, in other cases of bad URLs, we get much more graceful failures when we try to read from the object returned by urllib.urlopen.

>>> import urllib
>>> urlReader = urllib.urlopen("http://mozilla.mirrors.skynet.be/pub/ftp.mozilla.org/firefox/releases/1.0/win32/en-US/Firefox%20Setup%201.0.exe")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/local/lib/python2.4/urllib.py", line 77, in urlopen
    return opener.open(url)
  File "/usr/local/lib/python2.4/urllib.py", line 180, in open
    return getattr(self, name)(url)
  File "/usr/local/lib/python2.4/urllib.py", line 305, in open_http
    return self.http_error(url, fp, errcode, errmsg, headers)
  File "/usr/local/lib/python2.4/urllib.py", line 322, in http_error
    return self.http_error_default(url, fp, errcode, errmsg, headers)
  File "/usr/local/lib/python2.4/urllib.py", line 550, in http_error_default
    return addinfourl(fp, headers, "http:" + url)
  File "/usr/local/lib/python2.4/urllib.py", line 836, in __init__
    addbase.__init__(self, fp)
  File "/usr/local/lib/python2.4/urllib.py", line 786, in __init__
    self.read = self.fp.read
AttributeError: 'NoneType' object has no attribute 'read'

The attached file is a three-line script that demos the problem. |
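For a checker that touches hundreds of URLs, one way to keep a single bad response from killing the run is to wrap each open in a guard. The following is only a sketch under assumptions: `check_url` and its `opener` parameter are hypothetical names, not part of Sentry 2 or of urllib, and the `except ... as` syntax needs a newer Python than the 2.3/2.4 releases in this report. It also only guards against exceptions; it does not inspect the HTTP status.

```python
try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib import urlopen  # Python 2, as in the report


def check_url(url, opener=urlopen):
    """Return (ok, detail) for one URL without letting a failure escape.

    `opener` is any callable that takes a URL and returns a file-like
    object; it is injectable so the guard can be exercised offline.
    """
    reader = None
    try:
        reader = opener(url)
        reader.read(1)  # touch the body, as the mirror checker would
        return True, "opened"
    except AttributeError as exc:
        # The failure from the traceback above: the response wrapper was
        # built around a file object that was None.
        return False, "broken response: %s" % exc
    except IOError as exc:
        # Ordinary network failures (URLError/HTTPError on Python 3).
        return False, "I/O error: %s" % exc
    finally:
        if reader is not None:
            reader.close()
```

Passing a fake `opener` that raises the same AttributeError is enough to confirm the guard reports the URL as failed instead of aborting the whole run.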
|
|
msg24619 - (view) |
Author: Jarek Zgoda (zgoda) |
Date: 2005-03-15 09:52 |
Logged In: YES user_id=92222 No such error on Windows: Python 2.4 (#60, Nov 30 2004, 11:49:19) [MSC v.1310 32 bit (Intel)] on win32 |
|
|
msg24620 - (view) |
Author: K Lars Lohn (lohnk) |
Date: 2005-03-15 16:50 |
Logged In: YES user_id=1239273 This problem is apparently transient, depending on network conditions or, perhaps, the configuration of the server end. As of 3/14 the problem has mysteriously vanished.... |
|
|
msg24621 - (view) |
Author: Skip Montanaro (skip.montanaro) *  |
Date: 2005-03-15 19:09 |
Logged In: YES user_id=44345 Looking through the code I believe I traced the problem back to httplib.HTTP which sets self.fp to None when it's closed. It seems that urllib is trying to access this object after the connection's been closed. I realize the problem has passed for the moment, but have you considered using urllib2? The urllib library still uses httplib.HTTP which is really only there for backward compatibility. From this end it would be nice to leave urllib and httplib.HTTP alone. New apps should probably use urllib2 which uses the newer httplib.HTTPConnection class. |
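The mechanism described here can be illustrated without any network access. This is a rough sketch, not the actual urllib or httplib source: `ReaderWrapper`, `ClosedConnection`, and `wrap_response` are hypothetical stand-ins. Only the `self.read = self.fp.read` line is taken from the traceback, and `getfile()` is the accessor named later in this thread.

```python
class ReaderWrapper(object):
    """Hypothetical stand-in for the wrapper seen in the traceback."""

    def __init__(self, fp):
        self.fp = fp
        # Same statement as in the traceback; fails when fp is None.
        self.read = self.fp.read


class ClosedConnection(object):
    """Hypothetical stand-in for a connection whose file was dropped."""

    def __init__(self):
        self.fp = None  # what a closed connection is said to leave behind

    def getfile(self):
        return self.fp


def wrap_response(conn):
    """Wrap the connection's file, failing clearly if it is gone."""
    fp = conn.getfile()
    if fp is None:
        raise IOError("connection closed before the response could be read")
    return ReaderWrapper(fp)
```

Calling `ReaderWrapper(None)` reproduces the AttributeError from the report, while `wrap_response(ClosedConnection())` raises an IOError that a caller can handle like any other network failure.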
|
|
msg24622 - (view) |
Author: K Lars Lohn (lohnk) |
Date: 2005-03-16 17:07 |
Logged In: YES user_id=1239273 I've changed over to urllib2. The only complication involved the exception handling model: urllib2's HTTPError exceptions cannot be pickled because they contain an open socket._fileobject. While mildly inconvenient, the workaround was not difficult. |
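The workaround itself is not shown in this thread. One possible approach, sketched here purely as an assumption, is to copy the plain-data fields out of the exception into a dict before pickling. `summarize_http_error` and `FakeHTTPError` are hypothetical names; the attribute names `code`, `msg`, and `filename` follow those on urllib2's HTTPError, and a lock stands in for the open socket file.

```python
import pickle
import threading


def summarize_http_error(exc):
    """Copy the plain-data fields of an HTTP error into a picklable dict."""
    return {
        "code": getattr(exc, "code", None),
        "msg": str(getattr(exc, "msg", "")),
        "url": getattr(exc, "filename", None),
    }


class FakeHTTPError(Exception):
    """Hypothetical stand-in for an error holding an unpicklable resource."""

    def __init__(self, url, code, msg):
        Exception.__init__(self, msg)
        self.filename = url
        self.code = code
        self.msg = msg
        self.fp = threading.Lock()  # plays the role of the open socket file


err = FakeHTTPError("http://example.invalid/missing", 404, "Not Found")
# The summary survives a pickle round trip because it holds only plain data.
record = pickle.loads(pickle.dumps(summarize_http_error(err)))
```

The trade-off is that the response body and headers are discarded, which is acceptable when only the status needs to be recorded.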
|
|
msg24623 - (view) |
Author: Roy Smith (roysmith) |
Date: 2005-04-02 21:44 |
Logged In: YES user_id=390499 Wow, this is bizarre. I just spent some time tracking down exactly this same bug and was just about to enter it when I saw this entry.

For what it's worth, I can reliably reproduce this exception when fetching a URL from a deliberately broken server (well, at least I think it's broken; I have to double-check the HTTP spec to be sure this isn't actually allowed) which produces headers but no body:

(This is on Mac OSX-10.3.8, Python-2.3.4)

-------------------------------
Roy-Smiths-Computer:bug$ cat server.py
#!/usr/bin/env python

from BaseHTTPServer import *

class NullHandler (BaseHTTPRequestHandler):
    def do_GET (self):
        self.send_response (100)
        self.end_headers ()

server = HTTPServer (('', 8000), NullHandler)
server.handle_request()
------------------------------
Roy-Smiths-Computer:bug$ cat client.py
#!/usr/bin/env python

import urllib

urllib.urlopen ('http://127.0.0.1:8000')
---------------------------------
Roy-Smiths-Computer:bug$ ./client.py
Traceback (most recent call last):
  File "./client.py", line 5, in ?
    urllib.urlopen ('http://127.0.0.1:8000')
  File "/usr/local/lib/python2.3/urllib.py", line 76, in urlopen
    return opener.open(url)
  File "/usr/local/lib/python2.3/urllib.py", line 181, in open
    return getattr(self, name)(url)
  File "/usr/local/lib/python2.3/urllib.py", line 306, in open_http
    return self.http_error(url, fp, errcode, errmsg, headers)
  File "/usr/local/lib/python2.3/urllib.py", line 323, in http_error
    return self.http_error_default(url, fp, errcode, errmsg, headers)
  File "/usr/local/lib/python2.3/urllib.py", line 551, in http_error_default
    return addinfourl(fp, headers, "http:" + url)
  File "/usr/local/lib/python2.3/urllib.py", line 837, in __init__
    addbase.__init__(self, fp)
  File "/usr/local/lib/python2.3/urllib.py", line 787, in __init__
    self.read = self.fp.read
AttributeError: 'NoneType' object has no attribute 'read'
---------------------------------

I'll give urllib2 a try and see how that works. |
|
|
msg24624 - (view) |
Author: Georg Brandl (georg.brandl) *  |
Date: 2005-12-15 22:10 |
Logged In: YES user_id=1188172 Duplicate of #767111. |
|
|
msg24625 - (view) |
Author: John Nagle (nagle) |
Date: 2007-04-14 04:28 |
The basic cause of the "NoneType" attribute error is a straightforward bug in "urllib2". If an error during opening is routed to "http_error_default", a dummy file object is created using "addinfourl", so as to return something that looks like an empty file rather than raising an exception. But that doesn't work if "getfile()" on the httplib.HTTP object returns "None", which is unusual but can happen. We're seeing this error in Python 2.4 on Windows. We're still trying to understand exactly what network situation forces this path, but it's quite real. It seems to be occurring on Python 2.5 on Linux, too. If you override http_error_default in a subclass, you get an HTTP error of "-1" reported when this situation occurs. |
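An override along the lines described might look like the sketch below. It is an illustration under assumptions, not the library's code: `RaiseOnHTTPError`, `HTTPStatusError`, and `describe` are hypothetical names, and only the `http_error_default(url, fp, errcode, errmsg, headers)` signature is taken from the tracebacks earlier in this thread. The class is written as a mixin that never touches `fp.read`, so a `None` file object cannot trigger the AttributeError.

```python
class HTTPStatusError(Exception):
    """Hypothetical exception carrying the status reported by the opener."""

    def __init__(self, url, errcode, errmsg):
        Exception.__init__(
            self, "%s: HTTP status %s %s" % (url, errcode, errmsg))
        self.url = url
        self.errcode = errcode
        self.errmsg = errmsg


class RaiseOnHTTPError(object):
    """Mixin sketch: raise on HTTP errors instead of wrapping fp."""

    def http_error_default(self, url, fp, errcode, errmsg, headers):
        # Release the response file if there is one; it may be None.
        if fp is not None:
            fp.close()
        raise HTTPStatusError(url, errcode, errmsg)


def describe(exc):
    """Turn an HTTPStatusError into a short status string."""
    if exc.errcode == -1:
        # The "-1" case reported above: no usable status line arrived.
        return "no valid HTTP status received"
    return "HTTP %s %s" % (exc.errcode, exc.errmsg)
```

Whether such a mixin combines cleanly with a particular opener class depends on the Python version in use, so the combination should be checked against that version's urllib source.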
|
|