msg125654 - (view) |
Author: Peter Creath (pjcreath) |
Date: 2011-01-07 16:23 |
Calling wave.close() fails to release all references to the file passed in via wave.open(filename_or_obj, "rb"). As a result, processing many wave files produces an IOError of too many files open.

This bug is often masked because this dangling reference is collected if the wave object is collected. However, if the wave object is retained, calling wave_obj.close() won't release the reference, and so the file will never be closed.

There are two solutions:

1) The workaround: the client program can explicitly close the file object it passed to the wave object ("file_obj.close()").
2) The bug fix: the wave module can properly release the extra reference to the file, by setting "self._data_chunk = None" in the close() method.

Explanation:

Trunk code (and 2.7.1, and older):

    def close(self):
        if self._i_opened_the_file:
            self._i_opened_the_file.close()
            self._i_opened_the_file = None
        self._file = None

but note initfp(self, file):

    ...
    self._file = Chunk(file, bigendian = 0)
    ...
    chunk = Chunk(self._file, bigendian = 0)
    ...
    self._data_chunk = chunk
    ...

therefore close needs to add: self._data_chunk = None |
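A minimal sketch of the workaround (solution 1) follows. It assumes an in-memory WAV built by a hypothetical helper, make_wav_bytes(), and a hypothetical function count_frames(); neither name comes from the report. It only illustrates that the caller closes the file object it passed in, and is not a statement about how any particular wave.py version handles its internal references.

```python
import io
import wave


def make_wav_bytes(n_frames=8):
    """Build a tiny mono 16-bit WAV in memory (hypothetical helper)."""
    buf = io.BytesIO()
    w = wave.open(buf, "wb")
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(8000)
    w.writeframes(b"\x00\x00" * n_frames)
    w.close()  # finalizes the header; buf stays open because we own it
    return buf.getvalue()


def count_frames(file_obj):
    """Read the frame count from a file object that the caller owns."""
    w = wave.open(file_obj, "rb")
    try:
        return w.getnframes()
    finally:
        w.close()  # does not close file_obj: wave did not open it


file_obj = io.BytesIO(make_wav_bytes())
frames = count_frames(file_obj)
still_open_after_wave_close = not file_obj.closed
file_obj.close()  # the workaround: the caller closes what it opened
```

With the explicit file_obj.close() in place, the underlying resource is released regardless of which references the wave object still holds.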
|
|
msg125750 - (view) |
Author: Ned Deily (ned.deily) *  |
Date: 2011-01-08 05:51 |
Thanks for the report and analysis. Would you care to submit a patch to fix it? |
|
|
msg125751 - (view) |
Author: Ned Deily (ned.deily) *  |
Date: 2011-01-08 05:54 |
(Presumably this is also a problem for Python 3.) |
|
|
msg125772 - (view) |
Author: Georg Brandl (georg.brandl) *  |
Date: 2011-01-08 09:45 |
This is not a bug in the implementation: the file object is only closed when you passed a file name to open(). Like other APIs that allow file names or objects to be passed in, it is the caller's responsibility to close the file object if an object was passed. However, this was not documented. I've fixed that with r87859. |
|
|
msg126106 - (view) |
Author: Peter Creath (pjcreath) |
Date: 2011-01-12 15:31 |
Thank you for clarifying the documentation. However, I don't think that fully resolves the issue. I'm not complaining about a failure to close the file. As you observe, it doesn't need to (and shouldn't) close a file object, but it should release the reference. The code already tries to release the reference ("self._file = None"). It just fails to release it correctly, missing the other reference to the file object (self._data_chunk). That's the bug. Your clarification of the documentation is appreciated nonetheless. I've attached a patch as Ned requested. The same patch can currently be applied to release27-maint, release31-maint, and py3k. (The line numbers and surrounding context are identical.) |
|
|
msg126130 - (view) |
Author: Georg Brandl (georg.brandl) *  |
Date: 2011-01-12 19:24 |
I don't really see the bug here. Either you opened the file object, then you have to close it. Or wave.py opened it, then it will close it, no matter if it still has a reference or not. Of course we could set _data_chunk to None, but I'm unsure what behavior change you would expect from that. |
|
|
msg126131 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2011-01-12 19:30 |
Agreed with Georg. No OS resource is leaking if the file is explicitly closed (since it releases the underlying file descriptor). That the Python "file object" is still attached somewhere is of secondary importance. |
|
|
msg126133 - (view) |
Author: Peter Creath (pjcreath) |
Date: 2011-01-12 20:16 |
A point of clarification on the original report: Georg is completely right when he points out that this is only an issue when passing in a file object. If passed a filename, wave.py both opens and closes the file explicitly, and the dangling reference isn't important, as Antoine observes. However, a retained reference in the file-object case is still a leak.

Georg writes: "Of course we could set _data_chunk to None, but I'm unsure what behavior change you would expect from that."

It allows garbage collection to close the file object if there are no more references to it.

It seems reasonable for a client of wave.py to assume that close() will release all references to the object, and indeed the code seems to support that assumption -- it sets _file to None. If releasing references were truly of no importance, then I would argue that the line setting _file to None should be removed. It serves no purpose after wave.py has explicitly closed the file (if it opened it) other than to release a reference to the file object.

Therefore, I suggest that _data_chunk should also be set to None in order to release the reference completely, thereby allowing the file object to be garbage collected. |
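The reference argument above can be sketched with a toy model. Everything here is hypothetical (Resource, ToyChunk, ToyReader, resource_survives_close); it only mimics the attribute layout described in the report (_file wrapping the file, _data_chunk wrapping _file) and is not the actual wave.py code. The result assumes an implementation such as CPython, where unreachable objects are reclaimed by the time gc.collect() returns.

```python
import gc
import weakref


class Resource:
    """Stand-in for a caller-supplied file object (hypothetical)."""


class ToyChunk:
    """Wrapper that keeps a reference to whatever it wraps."""

    def __init__(self, inner):
        self.file = inner


class ToyReader:
    """Mimics only the reference layout described for wave.py."""

    def __init__(self, file, release_data_chunk):
        self._release_data_chunk = release_data_chunk
        self._file = ToyChunk(file)
        self._data_chunk = ToyChunk(self._file)

    def close(self):
        self._file = None
        if self._release_data_chunk:
            self._data_chunk = None  # the change proposed in the report


def resource_survives_close(release_data_chunk):
    """Return True if the resource is still reachable after close()."""
    resource = Resource()
    probe = weakref.ref(resource)
    reader = ToyReader(resource, release_data_chunk)
    reader.close()
    del resource      # drop the caller's own reference
    gc.collect()      # do not rely on reference counting alone
    alive = probe() is not None
    del reader        # the reader was deliberately retained until here
    return alive


without_fix = resource_survives_close(release_data_chunk=False)
with_fix = resource_survives_close(release_data_chunk=True)
```

In this model the retained reader keeps the resource reachable through _data_chunk unless close() clears that attribute too. Whether the object is then reclaimed promptly still depends on the interpreter.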
|
|
msg126137 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2011-01-12 21:05 |
> It allows garbage collection to close the file object if there are no
> more references to it.

This is a very bad policy to begin with. Garbage collection can be delayed for a number of reasons:
- someone might be running your program on a Python implementation which doesn't use reference counting (such as Jython or PyPy)
- an exception, together with its traceback object, might capture the value of some local variables and keep them alive (that is, reachable from the GC's point of view)
- a reference cycle might delay proper resource cleanup until the cyclic garbage collector kicks in

So the good thing to do is to close your file explicitly. Luckily, Python 2.6 and upwards makes it easier by using the "with" statement.

IMO this issue should be closed. |
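As a rough illustration of explicit closing with the "with" statement, the sketch below closes a caller-owned file object even when an exception interrupts processing. The helpers make_wav_bytes() and read_params() are hypothetical, and an io.BytesIO stands in for a real file. contextlib.closing() is used so the sketch does not depend on wave objects themselves supporting "with".

```python
import contextlib
import io
import wave


def make_wav_bytes(n_frames=4):
    """Build a tiny mono 16-bit WAV in memory (hypothetical helper)."""
    buf = io.BytesIO()
    with contextlib.closing(wave.open(buf, "wb")) as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(8000)
        w.writeframes(b"\x00\x00" * n_frames)
    return buf.getvalue()


def read_params(file_obj):
    """Return (channels, sample width, frame rate) for an open file object."""
    with contextlib.closing(wave.open(file_obj, "rb")) as w:
        return w.getnchannels(), w.getsampwidth(), w.getframerate()


source = io.BytesIO(make_wav_bytes())
try:
    with source as f:  # the caller owns f, so the caller's "with" closes it
        params = read_params(f)
        raise RuntimeError("simulated failure after reading")
except RuntimeError:
    pass

closed_despite_error = source.closed
```

The file object is closed when its "with" block exits, so cleanup does not wait for garbage collection or depend on which references the wave object retains.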
|
|
msg176382 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2012-11-25 18:31 |
I am not sure whether this issue should be closed as "rejected", because the suggested patch for Lib/wave.py was rejected, or as "fixed", because Georg committed the documentation patch (changeset 8239ec6f39e6) for this issue. |
|
|
msg176389 - (view) |
Author: Georg Brandl (georg.brandl) *  |
Date: 2012-11-25 20:47 |
I say "fixed": there was a bug (undocumented, but correct behavior) and that was fixed. |
|
|