Issue 1484695: tarfile.py fix for #1471427 and updates (original) (raw)

Created on 2006-05-09 13:51 by lars.gustaebel, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
tarfile-update.diff lars.gustaebel,2006-05-09 13:51
testtar.tar lars.gustaebel,2006-05-09 13:52
tarfile-headererror.diff lars.gustaebel,2006-05-18 10:00 patch to tarfile.py against r46040
tarfile-diffs.tar.gz lars.gustaebel,2006-07-09 08:01
complete-with-headererror.diff nnorwitz,2006-07-10 00:23 2.6 patch to use HeaderError
tests-for-1484695.diff lars.gustaebel,2006-12-20 11:51 testcase for HeaderError exception
Messages (17)
msg50205 - (view) Author: Lars Gustäbel (lars.gustaebel) * (Python committer) Date: 2006-05-09 13:51
I have assembled a patch that adds some features from my own development path of tarfile.py (http://www.gustaebel.de/lars/tarfile/) and fixes #1471427 which made some restructuring necessary. The patch affects Lib/tarfile.py, Lib/test/test_tarfile.py and Doc/lib/libtarfile.tex. The changes the patch makes are as follows: Sets the version to 0.8.0. Support for base256 encoding of number fields (nti() and itn()). Up to now this was hardcoded for the filesize field to allow filesizes greater than 8 GB but it is applicable to all number fields. TarInfo.tobuf() has a boolean argument "posix" which controls how number fields are written (base256 is non-posix). Both unsigned and signed (Sun and NeXT) checksums are calculated. Header validation moves from TarFile.next() to TarInfo.frombuf(). A header is valid only if its checksum is okay, in the past the checksum was calculated but ignored. The TarFile.next() method was rearranged which makes header processing clearer and more abstract and fixes bug #1471427. However, this change breaks the interface for subclassing in order to implement custom member types but makes it much easier at the same time. The mapping TYPE_METH was removed. A new test ReadGNULongTest was added to test_tarfile.py and testtar.tar was updated to be able to test the GNU extensions LONGNAME and LONGLINK.
msg50206 - (view) Author: Lars Gustäbel (lars.gustaebel) * (Python committer) Date: 2006-05-09 13:52
Logged In: YES user_id=642936 Here is testtar.tar to replace Lib/test/testtar.tar.
msg50207 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2006-05-10 16:26
Logged In: YES user_id=849994 Thanks for the patch, applied as rev. 45954.
msg50208 - (view) Author: Jim Jewett (jimjjewett) Date: 2006-05-17 22:25
Logged In: YES user_id=764593 (1) Why change the exception style? When raising an instance, the style guide (PEP-8, http:// <www.python.org/dev/peps/pep-0008/>) prefers to construct that instance; the older form is left over from String exceptions and will be removed in Python 3. I could understand leaving them as they were, but if you're going to change them to make them consistent, why not use the current format? (2) Why get rid of the debug messages (such as the checksum check) entirely? Guarding them in if self.debug, I would understand. (3) I wouldn't count on str(e) (where e is any ValueError instance) being as meaningful as the (current version's) ReadError("empty, unreadable or compressed file")
msg50209 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2006-05-18 06:11
Logged In: YES user_id=849994 Jim: I agree with your comments and have committed an improved version.
msg50210 - (view) Author: Lars Gustäbel (lars.gustaebel) * (Python committer) Date: 2006-05-18 10:00
Logged In: YES user_id=642936 On (1): agreed. On (2): There is still a debug message emitted for a bad checksum: In TarInfo.frombuf() at the bottom a ValueError is raised if they don't match and is passed on to TarFile.next() where it is put out as a debug message using the _dbg() method in the except clause. The debug message where it is now (r46040) is senseless because the try-block will be left when TarInfo.frombuf() fails . On (3): You're right, I attached a patch that adds another Exception HeaderError which is raised in TarInfo.frombuf() instead of ValueError in case of a bad header. I hope that is acceptable.
msg50211 - (view) Author: Lars Gustäbel (lars.gustaebel) * (Python committer) Date: 2006-07-03 08:59
Logged In: YES user_id=642936 I would like to emphasize on point (2) of my previous post, which should be resolved before 2.5 comes out.
msg50212 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2006-07-04 22:04
Logged In: YES user_id=33168 Lars, I don't see the debug (_dbg) call. Which version are you using? I didn't see r46040 in Lib/tafile.py for either 2.5 or 2.4. Could you point me to file/line/version where there's a problem (or just make a patch)? Thanks.
msg50213 - (view) Author: Lars Gustäbel (lars.gustaebel) * (Python committer) Date: 2006-07-09 08:01
Logged In: YES user_id=642936 I attached a tar with two diffs: only-tarfile.diff only removes the debug message. complete-with-headererror.diff removes the debug message and adds the HeaderError exception to TarInfo.frombuf() mentioned below.
msg50214 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2006-07-10 00:23
Logged In: YES user_id=33168 Ah, I see. I checked in the patch to remove the debug message since it's dead code. We are in feature freeze for 2.5, so I've updated the patch, attached it, set the group to 2.6. We can add the patch after 2.5 is out in a month or so. Committed revision 50503.
msg50215 - (view) Author: Thomas Waldmann (thomaswaldmann) Date: 2006-09-30 19:28
Logged In: YES user_id=100649 Could someone please backport this to 2.4 branch? Currently, 2.4.4c0 is broken when you try to extract a file that happens to have a "/" at index 99 in its name (tarinfo.name ends with "/" in this case, but this is NOT the end of the complete (LONG) filename). It gets misinterpreted as directory in this case. MoinMoin build process is broken on 2.4.4c0 due to this bug.
msg50216 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2006-12-19 20:55
What's the status of this patch? Does it still have to be applied somewhere?
msg50217 - (view) Author: Lars Gustäbel (lars.gustaebel) * (Python committer) Date: 2006-12-19 21:50
Yes, please apply complete-with-headererror.diff to the trunk.
msg50218 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2006-12-19 22:07
Applied in rev. 53090.
msg50219 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2006-12-20 05:33
Lars, could you also create a patch to test this? Thanks!
msg50220 - (view) Author: Lars Gustäbel (lars.gustaebel) * (Python committer) Date: 2006-12-20 11:51
Created a testcase (which revealed an inconsistency). Thanks for the tip! Attached is a patch against test_tarfile.py and tarfile.py. File Added: tests-for-1484695.diff
msg50221 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2006-12-20 11:55
Applied as rev. 53106.
History
Date User Action Args
2022-04-11 14:56:17 admin set github: 43337
2006-05-09 13:51:17 lars.gustaebel create