[Python-Dev] pathlib - current status of discussions
Brett Cannon brett at python.org
Wed Apr 13 19:09:57 EDT 2016
On Wed, 13 Apr 2016 at 15:20 Victor Stinner <victor.stinner at gmail.com> wrote:
> Oh, since others voted, I will also vote and explain my vote.
>
> I like choice 1, str only, because it's very well defined. In Python 3, Unicode is simply the native type for text. It's accepted by almost all functions. In other emails, I also explained that Unicode is fine for storing undecodable filenames on UNIX; it has worked as expected for many years (since Python 3.3).
>
> --
>
> If you cannot survive without bytes, I suggest adding two functions: one for str only, another which can return str or bytes.
>
> Maybe you want in fact two protocols: __fspath__ (str only) and __fspathb__ (bytes only)? os.fspathb() would first try __fspathb__, or fall back to os.fsencode(__fspath__). os.fspath() would first try __fspath__, or fall back to os.fsdecode(__fspathb__).
>
> IMHO it's not worth having such complexity while Unicode handles all use cases.
Implementing two magic methods for this seems like overkill. The best I would be willing to do for automatic encoding/decoding is to use os.fsencode()/os.fsdecode() on the argument or on what __fspath__() returned.
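For readers following the thread, the two-protocol idea quoted above (with the magic methods written here as `__fspath__` and `__fspathb__`) can be sketched roughly as below. This is only an illustration under stated assumptions, not code from the thread or from CPython: the module-level names `fspath` and `fspathb`, the pass-through of plain str and bytes arguments, and the example classes `TextPath` and `BytesPath` are all hypothetical.

```python
import os


def _lookup(obj, name):
    # Special methods are looked up on the type, not on the instance.
    return getattr(type(obj), name, None)


def fspath(obj):
    """Hypothetical str-only variant: try __fspath__, else decode __fspathb__."""
    if isinstance(obj, str):  # assumption: a plain str passes through unchanged
        return obj
    method = _lookup(obj, "__fspath__")
    if method is not None:
        return method(obj)
    method = _lookup(obj, "__fspathb__")
    if method is not None:
        return os.fsdecode(method(obj))
    raise TypeError(f"expected str or path object, not {type(obj).__name__}")


def fspathb(obj):
    """Hypothetical bytes-only variant: try __fspathb__, else encode __fspath__."""
    if isinstance(obj, bytes):  # assumption: plain bytes pass through unchanged
        return obj
    method = _lookup(obj, "__fspathb__")
    if method is not None:
        return method(obj)
    method = _lookup(obj, "__fspath__")
    if method is not None:
        return os.fsencode(method(obj))
    raise TypeError(f"expected bytes or path object, not {type(obj).__name__}")


class TextPath:
    """Example object that only implements the str protocol."""

    def __fspath__(self):
        return "/tmp/x"


class BytesPath:
    """Example object that only implements the (hypothetical) bytes protocol."""

    def __fspathb__(self):
        return b"/tmp/y"
```

The simpler alternative mentioned in the reply would keep a single `__fspath__` method and apply os.fsencode() or os.fsdecode() to its result, which removes the need for the second lookup in each function.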
> Or do you know of functions implemented in Python that accept both str and bytes?
On purpose, nothing off the top of my head.
> --
>
> The C implementation of the os module has an important path_converter() function:
>
>  * path_converter accepts (Unicode) strings and their
>  * subclasses, and bytes and their subclasses.  What
>  * it does with the argument depends on the platform:
>  *
>  *   * On Windows, if we get a (Unicode) string we
>  *     extract the wchar_t * and return it; if we get
>  *     bytes we extract the char * and return that.
>  *
>  *   * On all other platforms, strings are encoded
>  *     to bytes using PyUnicode_FSConverter, then we
>  *     extract the char * from the bytes object and
>  *     return that.
>
> This function will implement something like os.fspath().
>
> With os.fspath() only accepting str, we will directly return the Unicode string on Windows. On UNIX, Unicode will be encoded, as is already done for Unicode strings.
>
> This specific function would benefit from flavor 4 (os.fspath() can return str and bytes), but it's more the exception than the rule. It would be more of a micro-optimization than a good reason to drive the API design.
Yep, it's interesting to know but Chris and I won't let it drive the decision (I assume).
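As a rough Python analogue of the platform-dependent behaviour described in the quoted C comment, something like the following could serve as a mental model. It is a sketch under assumptions, not the actual C code: the name `convert_path` and its `platform` parameter are hypothetical, and where the C function extracts a raw `wchar_t *` or `char *` pointer, this version simply returns the str or bytes object.

```python
import os
import sys


def convert_path(path, platform=sys.platform):
    """Hypothetical Python analogue of the behaviour described for path_converter."""
    if not isinstance(path, (str, bytes)):
        raise TypeError(f"expected str or bytes, not {type(path).__name__}")
    if platform == "win32":
        # On Windows, str stays str (wide characters) and bytes stay bytes.
        return path
    # On other platforms, str is encoded with the filesystem encoding;
    # os.fsencode() returns bytes input unchanged.
    return os.fsencode(path)
```

Passing `platform` explicitly makes it possible to compare both branches on one machine, for example `convert_path("a.txt", platform="win32")` versus `convert_path("a.txt", platform="linux")`.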
-Brett
> Victor
>
> On Wednesday, 13 April 2016, Brett Cannon <brett at python.org> wrote:
> > https://gist.github.com/brettcannon/b3719f54715787d54a206bc011869aa1 has the four potential approaches implemented (although it doesn't follow the "separate functions" approach some are proposing and instead goes with the allow_bytes approach I originally proposed).