[Python-Dev] Pathlib enhancements - acceptable inputs and outputs for fspath and os.fspath() (original) (raw)

Stephen J. Turnbull stephen at xemacs.org
Mon Apr 18 15:26:56 EDT 2016


Brett Cannon writes:

If we continue with the "str is an encoding of file paths",

It's not. It's a representation, but not an encoding. In Python 3, encoding means a representation of a character string using bytes. It's using "encoding" generically for "representation" that makes your head hurt.

you can then build from "bytes is an encoding of str" to get a pyramid of file path encodings: Path -> str -> bytes. I don't think this is in any way a controversial view.

Perhaps not. But it's not particularly useful. ;-) Here's the pyramid I think about:

                             Path
                            /    \
                           /      \
                          V        V
                        str <-> bytes

That is, str and bytes are interchangeable without any knowledge of paths, which are on a higher level of complexity and abstraction. Although in pathlib, there's an assumption that paths are serialized to str which is (implicitly) serialized to bytes when talking to the OS, this is not necessarily true for other structured path classes, in particular it is not true for DirEntry (which is a "enhanced degenerate" path containing only one path segment but also other useful information abot the filesystem object addressed)

I haven't looked at Antipathy, but I would guess from Ethan's promotion of bytes paths and concern with efficiency that "bytes antipaths" do not "go through" str to get to bytes, they already are bytes (in the sense of class inheritance).

But that's when I realized that adding fspath support to os.fsdecode() and os.fsencode(), they become more coercion functions rather than encoding/decoding functions. It also means that os.fspath() has a place when you want to say "I only want to encode a file path to str" and avoid the decode bit that os.fsdecode() would do

I don't understand what you're trying to say here. fsdecode currently does not promise to decode anything, because it's polymorphic, accepting str and bytes. fsdecode and fsencode already are coercion functions.

It's this kind of semantic confusion and broken nomenclature that is why I dislike these polymorphic functions and objects so much. It is impossible to reason correctly about them. We're stuck with invoking "practicality" and muddling through. And the names mislead even experienced Pythonistas.

Steve



More information about the Python-Dev mailing list