
On Wed, 13 Apr 2016 at 15:20 Victor Stinner <victor.stinner@gmail.com> wrote:

Oh, since others voted, I will also vote and explain my vote.

I like choice 1, str only, because it's very well defined. In Python
3, Unicode is simply the native type for text. It's accepted by almost
all functions. In other emails, I also explained that Unicode is fine
to store undecodable filenames on UNIX; it has worked as expected for
many years (since Python 3.3).
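
To illustrate the point about undecodable filenames, here is a minimal sketch that round-trips bytes which are not valid UTF-8 through str. It uses the surrogateescape error handler from PEP 383, which os.fsdecode() and os.fsencode() rely on on UNIX. The filename and the explicit UTF-8 codec are assumptions chosen so the example behaves the same on every platform.

```python
# Hypothetical filename: Latin-1 encoded bytes, which are not valid UTF-8.
raw = b"caf\xe9.txt"

# The undecodable byte becomes a lone surrogate instead of raising an error.
name = raw.decode("utf-8", "surrogateescape")

# Encoding with the same error handler restores the original bytes exactly.
roundtrip = name.encode("utf-8", "surrogateescape")
```

The str value is not printable as-is, but it can be passed around and converted back to the original bytes without loss.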

\--

If you cannot survive without bytes, I suggest adding two functions:
one for str only, another which can return str or bytes.

Maybe you in fact want two protocols: \_\_fspath\_\_ (str only) and
\_\_fspathb\_\_ (bytes only)? os.fspathb() would first try \_\_fspathb\_\_, or
fall back to os.fsencode(\_\_fspath\_\_). os.fspath() would first try
\_\_fspath\_\_, or fall back to os.fsdecode(\_\_fspathb\_\_). IMHO it's not
worth having such complexity when Unicode handles all use cases.
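
For concreteness, the two-protocol idea could look roughly like the sketch below. It is only an illustration of the lookup order described above, not a proposed implementation: fspath() and fspathb() are module-level stand-ins for the suggested os functions, the pass-through of plain str and bytes is an assumption, and StrPath and BytesPath are hypothetical example classes.

```python
import os


def fspath(path):
    """Return str: try __fspath__ first, then decode __fspathb__."""
    if isinstance(path, str):  # assumption: plain str passes through
        return path
    cls = type(path)
    if hasattr(cls, "__fspath__"):
        return cls.__fspath__(path)
    if hasattr(cls, "__fspathb__"):
        return os.fsdecode(cls.__fspathb__(path))
    raise TypeError(f"expected str or path object, not {cls.__name__}")


def fspathb(path):
    """Return bytes: try __fspathb__ first, then encode __fspath__."""
    if isinstance(path, bytes):  # assumption: plain bytes pass through
        return path
    cls = type(path)
    if hasattr(cls, "__fspathb__"):
        return cls.__fspathb__(path)
    if hasattr(cls, "__fspath__"):
        return os.fsencode(cls.__fspath__(path))
    raise TypeError(f"expected bytes or path object, not {cls.__name__}")


class StrPath:  # hypothetical class implementing only the str protocol
    def __fspath__(self):
        return "/tmp/spam"


class BytesPath:  # hypothetical class implementing only the bytes protocol
    def __fspathb__(self):
        return b"/tmp/eggs"
```

Each function needs two lookups plus a conversion step, which is the extra complexity being weighed here.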

Implementing two magic methods for this seems like overkill. The best I would be willing to do with automatic encode/decode is to use os.fsencode()/os.fsdecode() on the argument or on what \_\_fspath\_\_() returned.
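
A rough sketch of that single-method alternative, under stated assumptions: fspath_as() and its want parameter are hypothetical names invented for this example, and the only protocol consulted is \_\_fspath\_\_, whose result (or the argument itself) is passed through os.fsdecode() or os.fsencode().

```python
import os


def fspath_as(path, want=str):
    """Resolve `path` via __fspath__ if needed, then coerce it to `want`."""
    if not isinstance(path, (str, bytes)):
        method = getattr(type(path), "__fspath__", None)
        if method is None:
            raise TypeError(
                f"expected str, bytes or path object, not {type(path).__name__}"
            )
        path = method(path)
    # os.fsdecode()/os.fsencode() accept both str and bytes.
    return os.fsdecode(path) if want is str else os.fsencode(path)
```

With this shape there is one magic method, and the caller rather than the path object decides which type comes back.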

Or do you know functions implemented in Python accepting str \*and\* bytes?

On purpose, nothing off the top of my head.

\--

The C implementation of the os module has an important
path\_converter() function:

\* path\_converter accepts (Unicode) strings and their
\* subclasses, and bytes and their subclasses. What
\* it does with the argument depends on the platform:
\*
\* \* On Windows, if we get a (Unicode) string we
\* extract the wchar\_t \* and return it; if we get
\* bytes we extract the char \* and return that.
\*
\* \* On all other platforms, strings are encoded
\* to bytes using PyUnicode\_FSConverter, then we
\* extract the char \* from the bytes object and
\* return that.

This function will implement something like os.fspath().

With os.fspath() only accepting str, we will directly return the
Unicode string on Windows. On UNIX, Unicode will be encoded, as is
already done for Unicode strings.
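
In Python terms, the str-only behavior described here could be approximated as follows. This is a loose analogue for illustration only, not the C code: native_path() is a hypothetical name, and os.fsencode() stands in for PyUnicode\_FSConverter.

```python
import os
import sys


def native_path(path):
    """Return the form the platform's C API would consume for a str path."""
    if not isinstance(path, str):
        raise TypeError(f"expected str, not {type(path).__name__}")
    if sys.platform == "win32":
        # Windows: the wide-character API takes the Unicode string directly.
        return path
    # Other platforms: encode with the filesystem encoding.
    return os.fsencode(path)
```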

This specific function would benefit from flavor 4 (os.fspath() can
return str and bytes), but it's more an exception than the rule. It
would be more of a micro-optimization than a good reason to drive the API
design.

Yep, it's interesting to know but Chris and I won't let it drive the decision (I assume).

-Brett

Victor

On Wednesday, 13 April 2016, Brett Cannon <brett@python.org> wrote:
\>
\> https://gist.github.com/brettcannon/b3719f54715787d54a206bc011869aa1 has the four potential approaches implemented (although it doesn't follow the "separate functions" approach some are proposing and instead goes with the allow\_bytes approach I originally proposed).