msg45584 - (view) |
Author: Lars Gustäbel (lars.gustaebel) *  |
Date: 2004-03-17 15:59 |
I still develop tarfile.py sporadically on a separate branch (http://www.gustaebel.de/lars/tarfile/), and there are two features from this branch that I'd like to propose for inclusion in Python's tarfile.py:
1. Overcoming the 8GB file size limit (8GB-limit.patch). At the moment it is not possible to add files larger than 8GB to a tar archive. Although this is POSIX compliant, GNU tar offers an extension header for large files that encodes file sizes in an 88-bit number instead of the common 11-digit octal number. Like all other GNU extensions in tarfile.py, this feature is turned on and off using the TarFile.posix attribute.
2. Automatic compression detection for the stream interface (stream-detect-compr.patch). tarfile.py's stream interface (which can be used to access tape devices or simply read a tar from stdin) is a bit difficult to use because it's not able to detect whether an archive is compressed or not. Compression has to be explicitly specified using mode ("r|", "r|gz", "r|bz2"). The patch introduces a fourth mode "r|*" |
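To illustrate the size limit described in item 1, here is a minimal sketch of how a 12-byte tar size field could be filled either with 11 octal digits or with the GNU base-256 form (a 0x80 marker byte followed by an 88-bit big-endian number). The function names and the `posix` parameter are hypothetical and only mirror the role of the TarFile.posix attribute mentioned above; this is not the code from 8GB-limit.patch.

```python
# Sketch only: encode_size/decode_size are illustrative names, not tarfile.py API.
OCTAL_DIGITS = 11
MAX_OCTAL_SIZE = 8 ** OCTAL_DIGITS - 1  # largest size 11 octal digits can hold (8 GiB - 1)


def encode_size(size, posix=False):
    """Return a 12-byte size field for a tar header."""
    if size < 0:
        raise ValueError("size must be non-negative")
    if size <= MAX_OCTAL_SIZE:
        # Common case: 11 octal digits plus a terminating NUL byte.
        return ("%011o" % size).encode("ascii") + b"\0"
    if posix:
        # Without the GNU extension there is no way to store the size.
        raise ValueError("file too large for an 11-digit octal size field")
    # GNU extension: marker byte 0x80, then the size as an 88-bit big-endian number.
    return b"\x80" + size.to_bytes(11, "big")


def decode_size(field):
    """Inverse of encode_size for the two encodings sketched above."""
    if field[0] == 0x80:
        return int.from_bytes(field[1:], "big")
    return int(field.rstrip(b"\0 ").decode("ascii"), 8)
```

With this sketch, a size of exactly 8 ** 11 bytes is the first value that no longer fits the octal form and falls through to the base-256 branch.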
msg45585 - (view) |
Author: Neal Norwitz (nnorwitz) *  |
Date: 2004-07-20 22:28 |
Logged In: YES user_id=33168 I checked in the 8GB limit patch. Lib/tarfile.py 1.14. I didn't check in the stream patch for 2 reasons: 1) I don't know the need. Is this common? I've never heard of it. 2) The type parameter name was changed to comtype. I wasn't sure if this was necessary. It potentially (albeit unlikely) could break a program. I'm not concerned about changing the name of the attribute. Lars, can you provide a good reason to add this part of the patch? If it's not likely to be used, I don't think it should be added. If it is added, there should also be a test. Thanks. |
|
|
msg45586 - (view) |
Author: Neal Norwitz (nnorwitz) *  |
Date: 2004-07-20 22:31 |
Logged In: YES user_id=33168 Lars, could you look at bug 949052 and provide any guidance? Thanks. |
|
|
msg45587 - (view) |
Author: Lars Gustäbel (lars.gustaebel) *  |
Date: 2004-07-21 07:54 |
Logged In: YES user_id=642936 tarfile.py's stream interface must be used if the user wants to read an archive that is not a seekable file, e.g. stdin or a tape device. ATM, it is the user's job to find out whether the stream is compressed (mode="r|gz" or "r|bz2") or uncompressed (mode="r|"), which makes the stream interface kind of awkward and unusable for many users. The patch introduces an additional mode "r|*" |
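As a rough illustration of the use case described here, the sketch below reads archives from an object that only supports read(), standing in for stdin or a tape device, using the "r|*" mode as it exists in current versions of the tarfile module. The NonSeekable wrapper and the helper names are hypothetical and exist only for this example.

```python
import io
import tarfile


class NonSeekable:
    """Hypothetical stand-in for stdin or a tape device: read() only, no seek()/tell()."""

    def __init__(self, data):
        self._buf = io.BytesIO(data)

    def read(self, size=-1):
        return self._buf.read(size)


def make_archive(mode):
    """Build a one-member archive in memory; mode is "w", "w:gz" or "w:bz2"."""
    buf = io.BytesIO()
    payload = b"hello\n"
    with tarfile.open(fileobj=buf, mode=mode) as tar:
        info = tarfile.TarInfo("hello.txt")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def list_stream(data):
    """List member names without being told whether the stream is compressed."""
    with tarfile.open(fileobj=NonSeekable(data), mode="r|*") as tar:
        return [member.name for member in tar]


if __name__ == "__main__":
    for mode in ("w", "w:gz", "w:bz2"):
        print(mode, list_stream(make_archive(mode)))
```

The caller passes the same mode string for all three archives, which is the convenience the explicit "r|", "r|gz" and "r|bz2" modes do not offer.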
msg45588 - (view) |
Author: Lars Gustäbel (lars.gustaebel) *  |
Date: 2004-07-21 12:55 |
Logged In: YES user_id=642936 I just created tests for the stream-detect-compr.patch, attached as test.patch. BTW, I examined bug #949052, and opened a patch (#995126). |
|
|
msg45589 - (view) |
Author: Martin v. Löwis (loewis) *  |
Date: 2005-03-04 19:58 |
Logged In: YES user_id=21627 Lars, the streaming patch is outdated. If you still think it is necessary, please update the patch. While I can understand what the feature "automatic detection" does, I fail to see why you need a new syntax for open. AFAICT, "r" is equivalent to the newly-proposed "r:*". Why is it necessary to have two ways to spell the same thing? |
|
|
msg45590 - (view) |
Author: Lars Gustäbel (lars.gustaebel) *  |
Date: 2005-03-05 11:37 |
Logged In: YES user_id=642936 The asterisk notation is necessary only for the stream interface. There are the three possible modes "r|", "r|gz" and "r|bz2", and "r|*" |
msg45591 - (view) |
Author: Martin v. Löwis (loewis) *  |
Date: 2005-03-05 12:48 |
Logged In: YES user_id=21627 Thanks for the patch and the explanation; committed as libtarfile.tex 1.9, tarfile.py 1.27, test_tarfile.py 1.18, NEWS 1.1268. |
|
|