Issue 3066: FD leak in urllib2

Created on 2008-06-09 11:02 by bohdan, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description
unnamed bohdan, 2008-06-12 19:40
Messages (7)
msg67860 - (view) Author: Bohdan Vlasyuk (bohdan) Date: 2008-06-09 11:02
In urllib2.AbstractHTTPHandler.do_open, the following line creates a circular link:

    r.recv = r.read

[r.read is a bound method, so it contains a reference to 'r'. Therefore, r now refers to itself.] If the GC is disabled or doesn't run often, this creates an FD leak.

How to reproduce:

    import gc
    import urllib2
    u = urllib2.urlopen("http://google.com")
    s = [ u.fp._sock.fp._sock ]
    u.close()
    del u
    print gc.get_referrers(s[0])

    [<socket._fileobject object at 0xf7d42c34>, [<socket object, fd=4, family=2, type=1, protocol=6>]]

I would expect that only one reference to the socket would exist (the "s" list itself). I can reproduce with 2.4; the problem seems to still exist in SVN HEAD.
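The self-reference described above can be shown without any network access. The following is a minimal sketch, not the urllib2 code itself: FakeResponse is a hypothetical stand-in for the response object `r`, and the example is written for Python 3 on CPython (the report concerns Python 2). It relies on CPython's reference counting to free the acyclic object immediately.

```python
import gc
import weakref


class FakeResponse:
    """Hypothetical stand-in for the HTTP response object `r`."""

    def read(self):
        return b""


def make_plain():
    # No self-reference: the object dies as soon as the local name goes away.
    r = FakeResponse()
    return weakref.ref(r)


def make_cycle():
    # r.read is a bound method holding a reference to r; storing it on r
    # itself means r now refers to itself.
    r = FakeResponse()
    r.recv = r.read
    return weakref.ref(r)


gc.disable()  # simulate "the GC is disabled or doesn't run often"
try:
    plain_ref = make_plain()
    cycle_ref = make_cycle()

    plain_freed_by_refcount = plain_ref() is None
    cycle_freed_by_refcount = cycle_ref() is None

    gc.collect()  # an explicit collection still works while gc is disabled
    cycle_freed_after_gc = cycle_ref() is None
finally:
    gc.enable()

print(plain_freed_by_refcount, cycle_freed_by_refcount, cycle_freed_after_gc)
```

Under these assumptions, only the object with the `recv` attribute survives until the cyclic collector runs. That is the window in which a real socket's file descriptor would stay open.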
msg67998 - (view) Author: Sharmila Sivakumar (sharmila) Date: 2008-06-11 17:13
Since the socket object is added to a list, a reference to the object always exists, right? That would mean that it would not be garbage collected as long as the reference exists. On the other hand, it should also be noted that in the close method, the socket is not explicitly closed, and for a single urlopen, at least 3 sockets are opened.
msg68074 - (view) Author: Bohdan Vlasyuk (bohdan) Date: 2008-06-12 19:40
The list is not the problem. The problem is the other reference, from "<socket._fileobject object at 0xf7d42c34>". Also note that the workaround (u.fp.recv = None) removes the second reference. This is fine (at least in CPython), because the socket is destroyed when the refcount reaches zero, thus calling the finalizer.
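A rough illustration of why the workaround helps, assuming CPython reference counting. FakeSocket and FakeFileObject are hypothetical stand-ins (not the real socket classes), the `closed` list merely records when a finalizer runs, and the code targets Python 3.

```python
import gc

closed = []  # fake descriptor numbers whose finalizer has run


class FakeSocket:
    """Hypothetical socket; its finalizer 'releases' the descriptor."""

    def __init__(self, fd):
        self.fd = fd

    def __del__(self):
        closed.append(self.fd)


class FakeFileObject:
    """Hypothetical file object wrapping the socket."""

    def __init__(self, sock):
        self._sock = sock

    def read(self):
        return b""


gc.disable()
try:
    # Leaky case: the bound method stored on fp keeps fp, and therefore
    # the socket it owns, alive after the last external name is deleted.
    fp = FakeFileObject(FakeSocket(4))
    fp.recv = fp.read
    del fp
    closed_without_workaround = list(closed)

    # Workaround from the message above: drop the self-reference first.
    fp = FakeFileObject(FakeSocket(5))
    fp.recv = fp.read
    fp.recv = None
    del fp
    closed_with_workaround = list(closed)
finally:
    gc.enable()
    gc.collect()  # the leaked pair is only reclaimed here

print(closed_without_workaround, closed_with_workaround, closed)
```

In this sketch, descriptor 5 is released as soon as the refcount reaches zero, while descriptor 4 waits for the collector.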
msg72147 - (view) Author: James Antill (nevyn) Date: 2008-08-29 18:28
So if I add a:

    class _WrapForRecv:
        def __init__(self, obj):
            self.__obj = obj

        def __getattr__(self, name):
            if name == "recv":
                name = "read"
            return getattr(self.__obj, name)

...and then change:

    r.recv = r.read

...into:

    r = _WrapForRecv(r)

...it stops the leak, and afaics nothing bad happens.
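The wrapper idea can be checked in isolation. This sketch reuses the _WrapForRecv class from the message, but FakeResponse is a hypothetical stand-in for the real response object, and the code is written for Python 3 on CPython. It only shows that delegation creates no reference cycle; it says nothing about how the rest of urllib2 would treat the wrapped object.

```python
import gc
import weakref


class FakeResponse:
    """Hypothetical stand-in for the HTTP response object."""

    def read(self, amt=None):
        return b"payload"


class _WrapForRecv:
    def __init__(self, obj):
        self.__obj = obj

    def __getattr__(self, name):
        # Only called for attributes not found normally, so the
        # name-mangled self.__obj lookup does not recurse.
        if name == "recv":
            name = "read"
        return getattr(self.__obj, name)


gc.disable()
try:
    inner = FakeResponse()
    inner_ref = weakref.ref(inner)

    r = _WrapForRecv(inner)
    del inner

    data = r.recv()  # forwarded to FakeResponse.read

    del r  # wrapper -> inner is a one-way reference, so both die here
    freed_without_gc = inner_ref() is None
finally:
    gc.enable()

print(data, freed_without_gc)
```

Because the wrapper points at the response and nothing points back, reference counting alone reclaims both objects.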
msg81787 - (view) Author: Daniel Diniz (ajaksu2) * (Python triager) Date: 2009-02-12 17:50
Has (non-unittest) test and proposed (non-diff) patch inline.
msg86591 - (view) Author: DSM (dsm001) Date: 2009-04-26 02:17
I can't reproduce in python 2.5.4, 2.6.2, or 2.7 trunk (though I can with 2.4.6 and 2.5) on mac & linux. Quick bisection suggests that it was fixed in r53511 while solving related bug http://bugs.python.org/issue1601399, and the explanation given there is consistent with the symptom here: the _fileobject doesn't close itself, and r53511 makes sure that it does. Suggest closing as fixed.
msg87077 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2009-05-03 22:06
Not reproducible in head as stated.
History
Date User Action Args
2022-04-11 14:56:35 admin set github: 47316
2009-05-03 22:06:10 gregory.p.smith set status: open -> closed; resolution: fixed; messages: +
2009-04-26 02:17:35 dsm001 set nosy: + dsm001; messages: +
2009-02-13 01:19:21 ajaksu2 set nosy: + jjlee
2009-02-12 17:50:35 ajaksu2 set nosy: + ajaksu2, orsenthil; stage: test needed; messages: +; versions: + Python 2.6, - Python 2.4
2008-09-22 01:18:50 gregory.p.smith set assignee: gregory.p.smith; nosy: + gregory.p.smith
2008-08-29 18:28:33 nevyn set nosy: + nevyn; messages: +
2008-06-12 19:40:26 bohdan set files: + unnamed; messages: +
2008-06-11 17:13:23 sharmila set nosy: + sharmila; messages: +
2008-06-09 11:02:32 bohdan create