Issue 2663: shutil.copytree glob-style filtering [patch]

Created on 2008-04-20 21:28 by tarek, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name       Uploaded by  Date
copytree.patch  tarek        2008-05-23 21:29
Messages (16)
msg65652 - Author: Tarek Ziadé (tarek) * (Python committer) Date: 2008-04-20 21:28
Here's a first draft of a small add-on to shutil.copytree. This patch allows excluding some folders or files from the copy, given glob-style patterns. A callable can also be provided instead of the patterns, for more complex filtering. I haven't updated Doc/shutil.rst in this patch yet; I guess that can be done once the change is accepted and in its final shape.
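For illustration, here is a minimal sketch of the glob-style filtering described above. It is written against the API as it exists in current Python (the `ignore` parameter and `shutil.ignore_patterns`, the names the thread settles on later), not against the draft patch's `exclude` spelling. The sample tree and the helper names `build_sample_tree`, `copy_filtered` and `list_tree` are invented for the example.

```python
import os
import shutil
import tempfile


def build_sample_tree(root):
    """Create a tiny tree containing files we want and files we do not."""
    os.makedirs(os.path.join(root, "pkg", "tmp_cache"))
    for rel in ("pkg/module.py", "pkg/module.pyc",
                "pkg/tmp_cache/data.bin", "README.txt"):
        with open(os.path.join(root, *rel.split("/")), "w") as fh:
            fh.write("x")


def copy_filtered(src, dst):
    # Skip compiled files and any entry whose name starts with "tmp".
    shutil.copytree(src, dst, ignore=shutil.ignore_patterns("*.pyc", "tmp*"))


def list_tree(root):
    """Return sorted paths relative to root, using forward slashes."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            found.append(rel.replace(os.sep, "/"))
    return sorted(found)


with tempfile.TemporaryDirectory() as work:
    src = os.path.join(work, "src")
    dst = os.path.join(work, "dst")
    build_sample_tree(src)
    copy_filtered(src, dst)
    copied = list_tree(dst)
```

The patterns are matched against bare entry names in each visited directory, so `tmp*` drops the whole `tmp_cache` directory along with everything inside it.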
msg65663 - Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2008-04-21 13:41
On the interface, I would suggest renaming 'exclude' to 'ignore' for consistency with filecmp.dircmp. Also consider detecting the file separator in the patterns and interpreting them as absolute (if pattern.startswith(pathsep)) or relative with respect to src. On the implementation, consider making 'exclude_files' a set for faster lookup. It should also be possible to refactor the code to avoid checking the type of 'exclude' on every file and every recursion.
msg65670 - Author: Tarek Ziadé (tarek) * (Python committer) Date: 2008-04-22 07:39
I changed the patch based on all the remarks. As for absolute paths, I was wondering whether they would be useful, since the calls are recursive and relative to the visited directory.
msg65682 - Author: Raghuram Devarakonda (draghuram) (Python triager) Date: 2008-04-22 19:56
Is there any reason for rmtree not to support this exclusion feature as well? Both copytree and rmtree explicitly iterate over a list of names and, as I see it, this exclusion is really about which names to ignore. copytree and rmtree already have inconsistencies (rmtree has 'onerror' while copytree doesn't) and it would be nice not to add more.
msg65906 - Author: Tarek Ziadé (tarek) * (Python committer) Date: 2008-04-27 23:00
Agreed, rmtree should have it as well. I'll add that to the patch.
msg65907 - Author: Tarek Ziadé (tarek) * (Python committer) Date: 2008-04-27 23:16
While working on the patch to add the same feature to rmtree, I realized this makes no sense, since the root folder itself is removed at the end of the function once all its contents have been removed. So, unless we change this behavior, which I doubt is a good idea, it won't be possible. Maybe another API could be added to shutil to do any kind of processing on a tree, like removing files, without copying it as copytree does.
msg65908 - Author: Tarek Ziadé (tarek) * (Python committer) Date: 2008-04-27 23:32
I have thought of various ways to write this new API for the deletion use case, but I think nothing makes it easier and shorter than a simple os.walk call.
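As a rough sketch of what such an os.walk call might look like for the deletion use case: the function name `remove_matching` and the sample files are hypothetical, and nothing like this was added to shutil as part of this issue.

```python
import fnmatch
import os
import tempfile


def remove_matching(root, *patterns):
    """Delete files under root whose names match any glob pattern.

    Directories, including root itself, are left in place.
    Returns the list of removed paths.
    """
    removed = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            if any(fnmatch.fnmatch(name, pattern) for pattern in patterns):
                path = os.path.join(dirpath, name)
                os.remove(path)
                removed.append(path)
    return removed


with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "sub"))
    for rel in ("a.py", "a.pyc", os.path.join("sub", "b.pyc")):
        with open(os.path.join(root, rel), "w") as fh:
            fh.write("x")

    removed = remove_matching(root, "*.pyc")
    removed_names = sorted(os.path.basename(p) for p in removed)
    root_survives = os.path.isdir(root)
    source_survives = os.path.exists(os.path.join(root, "a.py"))
```

Unlike rmtree, this never removes the root directory, which is why the filtering question does not arise in the same way.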
msg65909 - Author: Tarek Ziadé (tarek) * (Python committer) Date: 2008-04-28 00:17
This patch includes the documentation for shutil.rst as well. (I removed the older patches.)
msg65919 - Author: Raghuram Devarakonda (draghuram) (Python triager) Date: 2008-04-28 14:12
My update with email failed so I am just copying my response here:
> while working on the patch to add the same feature in rmtree, I realized
> this is a non sense since the root folder itself is removed at the end
> of the function when all its content is removed.
Indeed. Sorry about that.
> So, unless we change this behavior, which I doubt it is a good idea, it
> won't be possible.
I agree. But in general, it would be nice to separate file list generation from the actual operation, similar to a shell, which resolves the pattern while the command itself cares only about the files passed to it. This is not necessarily a comment on this patch, which I am hoping to check out soon.
msg65924 - Author: Raghuram Devarakonda (draghuram) (Python triager) Date: 2008-04-28 17:49
The patch looks good to me.
msg65925 - Author: Raghuram Devarakonda (draghuram) (Python triager) Date: 2008-04-28 17:53
I forgot to add that the example provided in the rst doc is incorrect. The copytree() call in that example should be given a destination path as well. In addition, the docstring for copytree mentions "which is a directory list". "directory list" is a bit vague and should ideally be replaced by something like "list of elements" (which is what appears in the doc) or "list of entries".
msg66149 - Author: Tarek Ziadé (tarek) * (Python committer) Date: 2008-05-03 08:33
Right, thanks. I have corrected the doc, and pushed some examples at the bottom of the module documentation.
msg67158 - Author: Tarek Ziadé (tarek) * (Python committer) Date: 2008-05-21 15:04
Patch updated for the new trunk.
msg67225 - Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2008-05-23 10:02
Hi Tarek, here's a review:
* The new docs are not very clear about ignore_patterns being a function factory. E.g.: """The callable must return a list of folder and file names relative to the path, that will be ignored in the copy process. :func:`ignore_patterns` is an example of such callable.""" Rather, the *return value* of ignore_patterns is an example of such a callable.
* The new docs should also note that copytree is called recursively, and therefore the ignore callable will be called once for each directory that is copied.
* Instead of "path and its elements" the terminology should be "directory" and "the list of its contents, as returned by os.listdir()". Likewise, "folder" should be "directory".
* The second new example makes me wonder if *ignore* is the correct name for the parameter. Is *filter* better?
* A nit; the signature should be "copytree(src, dst[, symlinks[, ignore]])".
* The patch adds a space in the definition of rmtree().
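The first two review points can be illustrated with a small sketch. `make_ignore` below is a hypothetical stand-in for `shutil.ignore_patterns`, not its actual implementation; it additionally records each directory it is called for (the `visited` attribute is invented for the demonstration) to show that copytree invokes the callable once per directory copied.

```python
import fnmatch
import os
import shutil
import tempfile


def make_ignore(*patterns):
    """Hypothetical factory: returns the callable that copytree will invoke."""
    visited = []  # directories the returned callable was invoked for

    def _ignore(directory, contents):
        # copytree passes the directory being copied and the names in it.
        visited.append(directory)
        ignored = set()
        for pattern in patterns:
            ignored.update(fnmatch.filter(contents, pattern))
        return ignored

    _ignore.visited = visited
    return _ignore


with tempfile.TemporaryDirectory() as work:
    src = os.path.join(work, "src")
    dst = os.path.join(work, "dst")
    os.makedirs(os.path.join(src, "a", "b"))
    for name in ("keep.txt", "old.bak"):
        with open(os.path.join(src, "a", name), "w") as fh:
            fh.write("x")

    ignore = make_ignore("*.bak")  # the *return value* is the ignore callable
    shutil.copytree(src, dst, ignore=ignore)

    calls = len(ignore.visited)  # one call each for src, src/a and src/a/b
    kept = os.path.exists(os.path.join(dst, "a", "keep.txt"))
    skipped = not os.path.exists(os.path.join(dst, "a", "old.bak"))
```

Passing `make_ignore` itself as `ignore=` would be a mistake for the same reason the review gives: the factory builds the callable, and only what it returns has the `(directory, contents)` signature copytree expects.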
msg67269 - Author: Tarek Ziadé (tarek) * (Python committer) Date: 2008-05-23 21:29
Thanks Georg, I have changed the patch accordingly. There's one issue left: the name of the parameter (ignore). I renamed it to this at Alexander's suggestion, for consistency with filecmp.dircmp, which uses ignore. By the way, I was wondering: do we need to use reStructuredText in function docstrings as well?
msg69276 - Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2008-07-05 10:13
Committed in r64722. Thanks everyone!
History
Date User Action Args
2022-04-11 14:56:33 admin set github: 46915
2008-07-05 10:13:53 georg.brandl set status: open -> closed, resolution: fixed, messages: +
2008-06-07 19:30:15 giampaolo.rodola set nosy: + giampaolo.rodola
2008-05-23 21:30:40 tarek set files: - copytree2.patch
2008-05-23 21:30:06 tarek set files: + copytree.patch, messages: +
2008-05-23 10:03:09 georg.brandl set nosy: + georg.brandl, messages: +
2008-05-21 15:05:45 tarek set files: - copytree.patch
2008-05-21 15:05:25 tarek set files: + copytree2.patch, messages: +
2008-05-03 08:38:15 tarek set files: - copytree.patch
2008-05-03 08:38:09 tarek set files: + copytree.patch
2008-05-03 08:34:58 tarek set files: - copytree.patch
2008-05-03 08:34:51 tarek set files: + copytree.patch
2008-05-03 08:33:29 tarek set files: - shutil.copytree.patch
2008-05-03 08:33:18 tarek set files: + copytree.patch, messages: +
2008-04-28 17:53:29 draghuram set messages: +
2008-04-28 17:49:18 draghuram set messages: +
2008-04-28 14:12:04 draghuram set messages: +
2008-04-28 00:17:36 tarek set files: - shutil.copytree.filtering.patch
2008-04-28 00:17:31 tarek set files: - shutil.copytree.filtering.patch
2008-04-28 00:17:15 tarek set files: + shutil.copytree.patch, messages: +
2008-04-27 23:32:34 tarek set messages: +
2008-04-27 23:16:52 tarek set messages: +
2008-04-27 23:00:05 tarek set messages: +
2008-04-22 19:56:34 draghuram set messages: +
2008-04-22 07:40:10 tarek set files: + shutil.copytree.filtering.patch, messages: +
2008-04-21 13:41:59 belopolsky set nosy: + belopolsky, messages: +
2008-04-21 10:51:10 draghuram set nosy: + draghuram
2008-04-20 22:27:24 gustavo set nosy: + gustavo
2008-04-20 21:28:22 tarek create