[Python-Dev] pathlib - current status of discussions
Koos Zevenhoven k7hoven at gmail.com
Thu Apr 14 13:56:54 EDT 2016
On Thu, Apr 14, 2016 at 7:46 PM, Ethan Furman <ethan at stoneleaf.us> wrote:
> What many folks seem to be missing is that you (generic you) have control of your data. If you are not working at the bytes layer, you shouldn't be getting bytes objects because:
>
> - you specified str when asking for data from the OS, or
> - you transformed the incoming bytes from whatever external source to str when you received them.
The above appears to contradict some previous posts, including your own. Let me try to resolve that:
Code that deals with paths can be divided into groups as follows:
(1) Code that has access to pathname/filename data and has some level of control over what data type comes in. This code may for instance choose to deal with either bytes or str.
(2) Code that takes the path or file name that it happens to get and does something with it. This type of code can be divided into subgroups as follows:
(2a) Code that accepts only one type of path (e.g. str, bytes or pathlib objects) and fails if it gets something else.
(2b) Code that wants to support different types of paths such as str, bytes or pathlib objects. This includes os.path.*, os.scandir, and various other standard library code. Presumably there is also third-party code that does the same. These functions may want to preserve the str-ness or bytes-ness of the paths in case they return paths, as the stdlib now does. But new code may even want to return pathlib objects when it gets such objects as inputs. This is the duck-typing or polymorphic code we have been talking about. Code of this type (2b) may want to avoid implicit conversions, because such conversions make life more difficult for code of the other types.
(feel free to fill in more categories of code)
So the code of type (2b) is trying to make all categories happy by returning objects of the same type that it gets as input, while the other categories are probably in the situation where they don't necessarily need to make other categories of code happy.
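To make the type (2b) behaviour concrete, here is a minimal sketch of a function that returns the same kind of object it receives. The name add_suffix and its behaviour are hypothetical, chosen only for illustration; this is not how any stdlib function is implemented.

```python
import os
from pathlib import PurePath, PurePosixPath


def add_suffix(path, suffix):
    """Hypothetical type-(2b) helper: append a str suffix to a path.

    The result has the same kind of type as the input (str, bytes or a
    pathlib object); no implicit conversion leaks out to the caller.
    """
    if isinstance(path, PurePath):
        # pathlib in, pathlib out
        return path.with_name(path.name + suffix)
    if isinstance(path, bytes):
        # bytes in, bytes out; encode the suffix with the filesystem encoding
        return path + os.fsencode(suffix)
    if isinstance(path, str):
        # str in, str out
        return path + suffix
    raise TypeError("expected str, bytes or a pathlib path, got "
                    + type(path).__name__)


str_result = add_suffix("a/b.txt", ".bak")
bytes_result = add_suffix(b"a/b.txt", ".bak")
pathlib_result = add_suffix(PurePosixPath("a/b.txt"), ".bak")
```

Code of type (2a) would instead keep only one of the three branches and let everything else fall through to the TypeError.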
And the question is this: Do we need to make code using both bytes and scandir happy? This is largely the same question as whether we have to support bytes in addition to str in the protocol.
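As a small illustration of what "bytes and scandir" means in practice: os.scandir gives back DirEntry objects whose path attribute matches the type of the argument it was called with. The helper name entry_path_types is hypothetical, and the with-statement form of os.scandir assumes a Python version that supports it (3.6 or later).

```python
import os
import tempfile


def entry_path_types(directory):
    """Return the set of types seen in DirEntry.path for `directory`."""
    with os.scandir(directory) as entries:
        return {type(entry.path) for entry in entries}


with tempfile.TemporaryDirectory() as tmp:
    # Create one file so that the directory is not empty.
    open(os.path.join(tmp, "example.txt"), "w").close()

    str_types = entry_path_types(tmp)                 # str directory in
    bytes_types = entry_path_types(os.fsencode(tmp))  # bytes directory in
```

Whether a path protocol should also carry the bytes case through is exactly the open question.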
(We may of course talk about third-party path libraries that have the same problem as scandir's DirEntry. Ethan's library is not exactly in the same category as DirEntry, since its path objects are instances of bytes or str and therefore do not need this protocol to begin with, except perhaps for conversions from other high-level path types so that different path libraries work together nicely.)
-Koos