bpo-31182: enhancements to tarfile and zipfile's CLI by GadgetSteve · Pull Request #3084 · python/cpython (original) (raw)
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters
[ Show hidden characters]({{ revealButtonHref }})
Added command line -m functionality to Lib\tarfile.py and Lib\zipile.py as follows:
- Wildcards (glob style)
- Recursion
- Add to existing repository, (uncompressed only for tars).
https://bugs.python.org/issue31182
…n recurse of directories. Added --add option.
…odified test_zipfile to test all verbosity levels of list command
brettcannon changed the title
bpo-31182 bpo-31182: enhancements to tarfile and zipfile's CLI
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the enhancements!
description = 'A simple command-line interface for tarfile module.' |
parser = argparse.ArgumentParser(description=description) |
epilog= ' '.join([ |
'When creating compression is determined by a tarfile', |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"When creating a tarfile, compression is determined by the filename's final extension of any of %s. "
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, much clearer.
help='Create tarfile from files.') |
---|
group.add_argument('-a', '--add', nargs='+', |
metavar=('', '<file|dir>'), |
help='Append files to a tarfile, (.tar only).') |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Append files to a tarfile (uncompressed only)?
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, clearer.
tar_mode = 'w:' + compressions[ext] if ext in compressions else 'w' |
---|
tar_files = args.create |
if args.verbose: |
print('Create {!r}:'.format(tar_name)) |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Creating {!r}".format(tar_name)
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
# Actual adding files is done later. |
---|
elif args.add is not None: |
tar_name = args.add.pop(0) |
_, ext = os.path.splitext(tar_name) |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might I suggest a bit better name than '_'? If it's unused, which it appears to be, ignored_name would do fine.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Used dummy
as ignored by pylint.
if args.verbose: |
---|
print('{!r} file created.'.format(tar_name)) |
print('Add to {!r}:'.format(tar_name)) |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Adding to {!r}:".format(tar_name)
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to adding if the file exists and to creating if it doesn't
'If a directory is passed as a parameter to create or append', |
---|
'it will be recursively added. Wildcards are supported, glob style,', |
'and a wildcard such as **/*.py will recurse. All recursion will', |
'skip any directories starting with . such as .git, etc.' |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A flag to override this behavior would be nice (provided it still ignores os.curdir and os.pardir!). And you should probably add info on its behavior on encountering a symbolic link, if applicable on the given filesystem. (Note the possibility for recursion if it follows directory symbolic links!)
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Help message in both tarfile and zipfile updated to include information on how to override, .**/*
and that this is as for glob.iglob() which is used.
for nm in os.listdir(path): |
---|
addToZip(zf, |
os.path.join(path, nm), os.path.join(zippath, nm)) |
if not os.path.split(path)[-1].startswith('.'): |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If verbose is set high enough, a warning about skipped directories (or included ones thanks to the flag I suggested above) would be good - that way, the user is more likely to realize when they should have included - or excluded - such directories.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
glob.iglob doesn't have an option to give me this information - I was trying to avoid using os.walk and walking, filtering myself. Added:
if file_name.startswith('**/'):
print("Recursing directories that don't start with .")
elif file_name.startswith('.**/'):
print("Recursing directories that do start with .")
else: |
---|
if args.verbose: |
print("Looking for wildcard matches to", path) |
matches = glob.iglob(path) # Might be a just wildcard |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be just a wildcard?
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment made clearer and moved.
if args.verbose: |
---|
print("Looking for wildcard matches to", path) |
matches = glob.iglob(path) # Might be a just wildcard |
for nm in matches: # if not then this will have no entries. |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If something definitely is a wildcard, then not matching anything (either at all, or in a specifically named directory) should probably be reported.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
zippath = os.path.basename(path) |
---|
if not zippath: |
zippath = os.path.basename(os.path.dirname(path)) |
if zippath in ('', os.curdir, os.pardir): |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why check for '' and then replace it with itself?
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
Hello, and thanks for your contribution!
I'm a bot set up to make sure that the project can legally accept your contribution by verifying you have signed the PSF contributor agreement (CLA).
Unfortunately our records indicate you have not signed the CLA. For legal reasons we need you to sign this before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue.
Thanks again to your contribution and we look forward to looking at it!
…thon into fix-issue-300156
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the patch! Left some minor comments. I think the doc would need a pass to improve the prose.
I wonder if it makes sense to add all these features in one PR. Do they depend or build on each other, or was it simply convenient to work on them all at once?
I’m adding the maintainers of tarfile and zipfile as reviewers to get their opinio.
@@ -1875,7 +1875,7 @@ def list(self, verbose=True, *, members=None): |
---|
""" |
self._check() |
if members is None: |
if members is None or members == []: # None or empty list |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think PEP 8 recommends to not add comments that just duplicate the code. This one says what the code is doing, but not why it is desirable.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The two sets of changes are largely duplicates and try to make the command lines equivalent between the two.
Good point on the comment, changed to why needed.
Spell Checked and a couple of minor changes to the News entry is that the doc that you had in mind?
@@ -2447,20 +2447,52 @@ def is_tarfile(name): |
---|
def main(): |
import argparse |
compressions = { |
# gz |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likewise, I don’t see the added values of the comments here.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed
description = 'A simple command-line interface for tarfile module.' |
parser = argparse.ArgumentParser(description=description) |
epilog= ' '.join([ |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tip: the Python compiler will automatically merge string literals that follow each other.
epilog = ('Beginning of text, ' 'followed by more text.')
Here you also need dynamic text replacement; a simple way to do that is to use an f-string:
available_compressions = ', '.join(compressions) epilog = f'final extension (one of {available_compressions})'
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good tips, (I guess I am still stuck in python 2 mode). Also reworked tarfile.main & zipfile.main docstrings to make more grammatical, (and better spelt).
Also added note to zipfile epilogue that only a single compress can be used at invocation but later adds can use a different compression.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also only offer available compression methods in zipfile.main options.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And only mention available compression methods as currently available.
I'll try to do a reread+edit of the docs, but am not sure when I can get around to it.
I may be able to work on documentation this weekend (family stuff prevented earlier). Which of the below need working on the most?
- Outputs during use of the command-line interface
- Clarification comments in the code
- Library documentation of the modules in general
See my comments on the tracker. Discussions about the design should be on the tracker, GitHub is only for technical review of the concrete patch.
@serhiy-storchaka - I'm not sure why documentation modifications wouldn't be part of the "concrete patch", but I'll take a look at the tracker when I have a moment.
@drallensmith what Serhiy is saying is that asking what should go into the docs is a bugs.python.org topic to discussion. In other words if it isn't already in the pull request then it's discussion on this issue.
@@ -2131,7 +2133,7 @@ def addToZip(zf, path, zippath): |
---|
return count |
if len(files): |
count = 0 |
with ZipFile(zip_name, mode) as zf: |
with ZipFile(zip_name, mode, allowZip64=args.zip64) as zf: |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This works without a call to set_defaults?
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes @merwok - Because the argument parser is set to store_false on the flag ards.zip64 is automatically set to True unless the flag is seen.
Reviewers
merwok merwok left review comments
drallensmith drallensmith requested changes
1st1 Awaiting requested review from 1st1
ncoghlan Awaiting requested review from ncoghlan
rhettinger Awaiting requested review from rhettinger
gustaebel Awaiting requested review from gustaebel
serhiy-storchaka Awaiting requested review from serhiy-storchaka