I talked to my colleague. He didn't remember the concrete use case, but he instantly mentioned three possible areas (in no order of preference):

1) pathlib + mtime
2) os.walk and pathlib
3) creation/removal of paths

He wasn't too sure, but I checked the docs and his memory seems to be correct:


\-----

1) https://docs.python.org/3/library/pathlib.html#pathlib.Path.stat

High-level path objects should return high-level \[insert type here\] objects. Put differently, an API for retrieving the timestamps from "stat()" (mtime and friends) as real date/time objects would be nice. I think that idea can be extended to other pathlib methods as well, to make them less "os-wrapper"-like and provide added value.
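
A minimal sketch of what such an API could return, built only on what exists today. The helper name "mtime_as_datetime" is hypothetical and not part of pathlib; choosing UTC as the time zone is also just an assumption made for the example.

```python
from datetime import datetime, timezone
from pathlib import Path


def mtime_as_datetime(path: Path) -> datetime:
    """Return the last-modification time of `path` as an aware UTC datetime.

    Hypothetical helper for illustration; pathlib itself has no such method.
    """
    # Path.stat() returns an os.stat_result; st_mtime is seconds since the epoch.
    return datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc)
```

With something like this, comparing two files' modification times or formatting them becomes ordinary datetime arithmetic instead of float juggling.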


\-----

2) I remember a discussion on python-ideas about using "glob" or "rglob". However, when searching the docs for "walk" (as in "os.walk") or for "iter", I don't find "glob"/"rglob". I can imagine that we (pathlib newbies back then) simply didn't discover them.

It would be great if the docs could be improved like the following:

"""
Path.rglob(pattern)
Walk down a given path; a wrapper for "os.scandir"/"os.listdir". This is like calling glob() with “\*\*” added in front of the given pattern:
"""

I think it would make "glob" and "rglob" more discoverable to new users.

NOTE: The docs' warning """Using the “\*\*” pattern in large directory trees may consume an inordinate amount of time.""" does not sound very encouraging. This is especially true for "rglob", as it is defined as "like calling glob() with “\*\*”".

That makes me wonder whether "rglob" performs slow globbing instead of a plain "os.scandir"/"os.listdir" traversal.

https://docs.python.org/3/library/pathlib.html#basic-use even promotes "glob" with "\*\*" right at the beginning, which rather discourages using "rglob" as a fast alternative to "os.walk"/"os.scandir"/"os.listdir". Renaming "rglob" or adding a "scan" method would definitely help here.
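
For comparison, a small sketch of the same "list every file below a directory" task written once with "os.walk" and once with "Path.rglob". Both function names are made up for this example, and it makes no claim about the relative speed of the two approaches. For a tree of plain files and directories the results are the same; symlinks and special files may be treated differently.

```python
import os
from pathlib import Path


def walk_files_os(root):
    """Collect all files below `root` using os.walk."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            found.add(Path(dirpath) / name)
    return found


def walk_files_pathlib(root):
    """Collect the regular files below `root` using Path.rglob."""
    # rglob("*") is documented as glob() with "**/" put in front of the pattern.
    return {p for p in Path(root).rglob("*") if p.is_file()}
```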


\-----

3) Again searching the docs for "create", "delete" (nothing found) and "remove", I found "Path.touch", "Path.rmdir" and "Path.unlink".

It would be great if we had an easy way to remove a complete subtree as with "shutil.rmtree". We mostly don't care if a directory is empty. We need the system to be in a state of "this path does not exist anymore".
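
A rough sketch of the behaviour meant here, combining "shutil.rmtree" with "Path.unlink". The name "remove_path" is hypothetical; nothing like it exists in pathlib, and the handling of symlinks (remove the link, not its target) is an assumption of this example.

```python
import shutil
from pathlib import Path


def remove_path(path: Path) -> None:
    """Make sure `path` no longer exists, whatever it was before.

    Hypothetical helper for illustration only.
    """
    if path.is_dir() and not path.is_symlink():
        # A real directory: remove it together with everything below it.
        shutil.rmtree(path)
    else:
        try:
            # A regular file, a symlink, or another non-directory entry.
            path.unlink()
        except FileNotFoundError:
            # Already gone, which is exactly the state we want.
            pass
```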

Moreover, touching a file is good enough to "create" it if you don't care about changing its mtime. If you do care about its mtime, "touch" is a problem.
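
One possible workaround with the current API, as a sketch: calling "touch" with "exist_ok=False" raises "FileExistsError" for an existing file instead of updating its mtime. The helper name "ensure_file_exists" is hypothetical.

```python
from pathlib import Path


def ensure_file_exists(path: Path) -> None:
    """Create `path` as an empty file if missing; leave an existing file untouched.

    Hypothetical helper for illustration only.
    """
    try:
        # With exist_ok=False, touch() refuses to operate on an existing file,
        # so the mtime of that file is not modified.
        path.touch(exist_ok=False)
    except FileExistsError:
        pass
```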

\------


That's it for our past issues with pathlib. Additionally, I have two further observations:

A) pathlib tries to mimic/publish some low-level APIs to its users. "unlink" is not something people would expect to use when they want to "delete" or to "remove" a file or a directory. I know where the term stems from but it's the wrong layer of abstraction IMHO. Same for "touch" or "chmod".

B) "rename" vs "replace". The difference is not really clear from the docs. You need to read "Path.replace" in order to understand "Path.rename" completely (one raises an exception, the other doesn't, if I read it correctly).
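
To illustrate the distinction as the docs describe it: "Path.replace" overwrites an existing target file unconditionally, while "Path.rename" silently overwrites on Unix but raises "FileExistsError" on Windows. The sketch below wraps both intents in explicitly named functions; both names are hypothetical, and the non-overwriting variant uses a simple check-then-rename, which is not atomic.

```python
from pathlib import Path


def move_overwriting(src: Path, dst: Path) -> None:
    """Move `src` to `dst`, replacing an existing file at `dst`."""
    src.replace(dst)


def move_without_overwriting(src: Path, dst: Path) -> None:
    """Move `src` to `dst`, refusing to clobber an existing `dst`.

    Hypothetical helper; the existence check and the rename are two
    separate steps, so another process could create `dst` in between.
    """
    if dst.exists():
        raise FileExistsError(str(dst))
    src.rename(dst)
```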


If there's some agreement to change things with respect to those 5 points, I am willing to put some time into it.


Best,
Sven