Issue 1014992: bug in tarfile.ExFileObject.readline
TarFile objects produce file-like objects to represent internal files in the archive. There appears to be a bug in the readline implementation for these objects.
When you specify the size parameter for readline, the standard function returns at most one line, truncated to at most size characters. In the tarfile module, however, it may return extra data beyond the end of the line, and it also fails to strip the trailing \r.
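For reference, the standard behaviour can be checked against an ordinary file-like object. The snippet below is only an illustration: it uses io.BytesIO from the standard library rather than a tarfile object, and the sample data is made up.

```python
import io

# Made-up sample data: two lines, the second longer than the size limit used below.
f = io.BytesIO(b"ab\ncdef\n")

first = f.readline(10)   # limit is larger than the line: stops at the newline
second = f.readline(2)   # limit is smaller than the line: stops after 2 bytes
third = f.readline()     # no limit: returns the rest of the current line

print(first, second, third)
```

With a size limit, readline never returns data past the first newline; it returns the whole line or the first size bytes of it, whichever is shorter.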
The problem is that after reading the maximum data, we break out of the loop before checking for a newline. This is fixed by modifying the while test.
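The kind of loop described above can be sketched as follows. This is not the tarfile source: readline_buggy, readline_fixed and BLOCK are hypothetical names, and the code only assumes a seekable binary stream that is read in small blocks.

```python
import io

BLOCK = 4  # hypothetical block size, kept small so a read crosses a line boundary


def readline_buggy(stream, size):
    """Hypothetical loop whose test only tracks the size budget."""
    buf = b""
    while len(buf) < size:
        chunk = stream.read(min(BLOCK, size - len(buf)))
        if not chunk:
            break
        buf += chunk
    return buf  # may contain data past the first newline


def readline_fixed(stream, size):
    """Same loop, with the newline check added to the while test."""
    buf = b""
    while len(buf) < size and b"\n" not in buf:
        chunk = stream.read(min(BLOCK, size - len(buf)))
        if not chunk:
            break
        buf += chunk
    nl = buf.find(b"\n")
    if nl >= 0:
        # Un-read the bytes that follow the newline so the next call sees them.
        stream.seek(nl + 1 - len(buf), io.SEEK_CUR)
        buf = buf[:nl + 1]
    return buf


data = b"ab\ncdefgh\n"
buggy_result = readline_buggy(io.BytesIO(data), 6)

stream = io.BytesIO(data)
fixed_first = readline_fixed(stream, 6)
fixed_second = readline_fixed(stream, 6)
```

In this sketch the buggy version returns six bytes that span two lines, while the fixed version stops at the newline and leaves the remaining bytes for the next call.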
This patch may break existing code which uses this 'feature'.
Logged In: YES user_id=21627
Thanks for the patch. Applied as tarfile.py 1.19 and 1.8.12.4, NEWS 1.831.4.148. I doubt this will break existing code since people typically invoke readline without arguments, and should expect that readline returns at most a line - hence the backport to 2.3.
As for dealing with \r: patches to support universal newlines would be welcome.