[Python-Dev] pathlib - current status of discussions

Victor Stinner victor.stinner at gmail.com
Wed Apr 13 18:19:42 EDT 2016


Oh, since others voted, I will also vote and explain my vote.

I like choice 1, str only, because it is very well defined. In Python 3, Unicode is simply the native type for text, and it is accepted by almost all functions. In other emails, I also explained that Unicode is fine for storing undecodable filenames on UNIX; this has worked as expected for many years (since Python 3.3).
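To illustrate the point about undecodable filenames, here is a minimal sketch of the surrogateescape round trip. The filename is made up, and UTF-8 is assumed as the filesystem encoding; on UNIX, os.fsdecode() and os.fsencode() apply the same error handler with the actual filesystem encoding.

```python
# Hypothetical raw filename as a UNIX system call might return it.
# The byte 0xff is never valid in UTF-8.
raw_name = b"report-\xff.txt"

# surrogateescape maps each undecodable byte to a lone surrogate
# code point in the range U+DC80..U+DCFF instead of raising an error.
text_name = raw_name.decode("utf-8", "surrogateescape")

# Encoding with the same handler restores the original bytes exactly,
# so the str form loses no information.
round_tripped = text_name.encode("utf-8", "surrogateescape")

print(repr(text_name))
print(round_tripped == raw_name)
```

The str value can be passed around like any other text and still names the same file when it is encoded again.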

--

If you cannot survive without bytes, I suggest adding two functions: one for str only, and another which can return str or bytes.

Maybe you in fact want two protocols: __fspath__ (str only) and __fspathb__ (bytes only)? os.fspathb() would first try __fspathb__, or fall back to os.fsencode(__fspath__). os.fspath() would first try __fspath__, or fall back to os.fsdecode(__fspathb__). IMHO such complexity is not worth it while Unicode handles all use cases.
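The two-protocol idea above could be sketched roughly as follows. This is only an illustration of the proposal, not an existing API: the method names are spelled here as __fspath__ and __fspathb__, the functions are plain module-level helpers rather than members of os, and passing str or bytes straight through is an assumption of the sketch.

```python
import os


def fspath(path):
    """Return a str path: try __fspath__, else decode __fspathb__."""
    if isinstance(path, str):
        return path  # assumption: plain str passes through unchanged
    cls = type(path)
    if hasattr(cls, "__fspath__"):
        return cls.__fspath__(path)
    if hasattr(cls, "__fspathb__"):
        return os.fsdecode(cls.__fspathb__(path))
    raise TypeError("expected str or a path object, got " + cls.__name__)


def fspathb(path):
    """Return a bytes path: try __fspathb__, else encode __fspath__."""
    if isinstance(path, bytes):
        return path  # assumption: plain bytes passes through unchanged
    cls = type(path)
    if hasattr(cls, "__fspathb__"):
        return cls.__fspathb__(path)
    if hasattr(cls, "__fspath__"):
        return os.fsencode(cls.__fspath__(path))
    raise TypeError("expected bytes or a path object, got " + cls.__name__)


class TextPath:
    """Hypothetical path object implementing only the str protocol."""

    def __init__(self, name):
        self.name = name

    def __fspath__(self):
        return self.name


class BytesPath:
    """Hypothetical path object implementing only the bytes protocol."""

    def __init__(self, name):
        self.name = name

    def __fspathb__(self):
        return self.name


print(fspath(TextPath("a.txt")), fspathb(TextPath("a.txt")))
print(fspath(BytesPath(b"b.txt")), fspathb(BytesPath(b"b.txt")))
```

Even in this small form, every path object and every caller has to reason about two protocols and two entry points, which is the complexity being questioned.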

Or do you know of functions implemented in Python that accept both str and bytes?

--

The C implementation of the os module has an important path_converter() function.

This function will implement something like os.fspath().

With os.fspath() accepting only str, we will directly return the Unicode string on Windows. On UNIX, the Unicode string will be encoded, as is already done for Unicode strings.
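In Python terms, the behavior described above might look like the sketch below. It is not the real path_converter() code, which is written in C; the function name to_native_path and its windows parameter are invented for illustration.

```python
import os
import sys


def to_native_path(path, *, windows=(sys.platform == "win32")):
    """Convert a str path to the form the OS-level call would receive.

    Hypothetical helper: on Windows the str is used as-is, on UNIX it
    is encoded with the filesystem encoding.
    """
    if not isinstance(path, str):
        raise TypeError("expected str, got " + type(path).__name__)
    if windows:
        return path  # wide-character APIs take the Unicode string directly
    return os.fsencode(path)  # UNIX system calls take bytes


print(repr(to_native_path("data.txt", windows=True)))
print(repr(to_native_path("data.txt", windows=False)))
```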

This specific function would benefit from flavor 4 (os.fspath() can return str and bytes), but it's more the exception than the rule. It would be more of a micro-optimization than a good reason to drive the API design.

Victor

On Wednesday, April 13, 2016, Brett Cannon <brett at python.org> wrote:

https://gist.github.com/brettcannon/b3719f54715787d54a206bc011869aa1 has the four potential approaches implemented (although it doesn't follow the "separate functions" approach some are proposing and instead goes with the allowbytes approach I originally proposed).


