Explanation from dablitz's comment at https://bugs.pypy.org/issue867: urllib2 in the stdlib leaks file descriptors if an exception is raised while opening a connection. The leak occurs because a socket is opened and an exception is then raised before any object holding the socket is returned, leaving the caller no way to close it explicitly. On CPython this is not a problem, because the object loses all references almost immediately and is closed right away; with a proper GC, however, it lingers until collected, so FDs build up if the same condition happens repeatedly (e.g. in a loop or while web crawling). The enclosed file is a script that reproduces the leak; to run it, invoke it as follows: leak.py. PyPy should start leaking FDs, which can be seen in /proc//fd. Related issues: http://bugs.python.org/issue3066, http://bugs.python.org/issue1208304, http://bugs.python.org/issue1601399
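The failure mode described above can be sketched in miniature. This is only an illustration, not the enclosed leak.py and not urllib2's code: the names `created`, `leaky_open` and `careful_open` are hypothetical, the failure is simulated, and no network I/O takes place. The `created` list exists only so the sockets can be inspected afterwards; as a side effect it keeps them referenced, which imitates a GC that has not collected them yet.

```python
import socket

# Demo-only registry (hypothetical): lets us inspect the sockets later and
# keeps them alive, imitating a collector that has not run yet.
created = []

def leaky_open():
    """Open a socket, then fail before returning anything that owns it."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    created.append(sock)
    # Simulated failure: the caller never receives an object it could close.
    raise IOError("simulated failure while opening")

def careful_open():
    """Same failure, but the socket is closed before the error propagates."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    created.append(sock)
    try:
        raise IOError("simulated failure while opening")
    except IOError:
        sock.close()  # release the file descriptor deterministically
        raise
```

Calling each function once inside a `try`/`except IOError` leaves the first socket open and the second one closed, which is the difference that matters when the failure repeats many times.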
Take care not to introduce regressions like the one in #12576 while fixing this. I'm not sure there are similar tests for urllib2; if not, they should be added.
This issue is a duplicate of bpo-12133, which has been fixed in Python 2.7 by:

commit c74a6ba2d6c1f331896cf8dacc698b0b88c7e670
Author: Victor Stinner <victor.stinner@haypocalc.com>
Date:   Fri Jun 17 14:06:27 2011 +0200

    Issue #12133: AbstractHTTPHandler.do_open() of urllib.request closes the HTTP
    connection if its getresponse() method fails with a socket error. Patch
    written by Ezio Melotti.
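The behaviour the commit message describes (close the connection when getresponse() fails with a socket error) might look roughly like the sketch below. It is not the actual patch: `getresponse_or_close` and `FakeConnection` are hypothetical names, and the fake connection only simulates the error so the example runs without a network.

```python
import socket

def getresponse_or_close(conn):
    """Return conn.getresponse(); on a socket error, close conn and re-raise."""
    try:
        return conn.getresponse()
    except socket.error:
        # Do not wait for the garbage collector to release the socket.
        conn.close()
        raise

class FakeConnection(object):
    """Hypothetical stand-in for an HTTP connection; performs no network I/O."""

    def __init__(self):
        self.closed = False

    def getresponse(self):
        raise socket.error("simulated socket error")

    def close(self):
        self.closed = True
```

With this pattern the error still reaches the caller, but the connection is already closed by the time it does, so repeated failures do not accumulate open descriptors on interpreters without reference counting.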