[Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

Nick Coghlan ncoghlan at gmail.com
Sat Apr 16 21:28:09 EDT 2016


On 16 April 2016 at 21:21, Stephen J. Turnbull <stephen at xemacs.org> wrote:

Nick Coghlan writes:

> On 15 April 2016 at 00:52, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> > Nick Coghlan writes:
> >
> > > The use case for returning bytes from __fspath__ is DirEntry, so you
> > > can write things like this in low level code:
> > >
> > > def myscandir(dirpath):
> > >     for entry in os.scandir(dirpath):
> > >         if entry.is_file():
> > >             with open(entry) as f:
> > >                 # do something
> >
> > Excuse me, but that is not a use case for returning bytes from
> > DirEntry.__fspath__. open() is perfectly happy taking str (including
> > surrogate-encoded raw bytes).
>
> That results in a different type for the file object's name:
>
>     >>> open("README.md").name
>     'README.md'
>     >>> open(b"README.md").name
>     b'README.md'

OK, you win, __fspath__ needs to be polymorphic.

But you've just shifted me to -1 on "os.fspath": it's an attractive nuisance. EIBTI, applications and high-level library functions should use os.fsdecode or os.fsencode. Functions that take a polymorphic argument and want to preserve type should invoke __fspath__ on the argument. That will visually signal that the caller is not merely low-level, but is explicitly a boundary function.

str and bytes aren't going to implement __fspath__ (since they're only sometimes path objects), so asking people to call the protocol method directly for any purpose would be a pain.

(You could rename the generic function as "os.fspath", I guess, but I really want to deprecate calling the polymorphic version in user code. fspath can be added if experience shows that polymorphic usage is very desirable outside the stdlib. This remark is in my not-so-Dutch opinion, of course.)

You may have missed my email where I agreed os.fspath() itself needs to ensure the output is a str object and throw an exception otherwise. The remaining API design debate relates to whether the polymorphic version should be "os.fspath(obj, allow_bytes=True)" or "os._raw_fspath(obj)" (with Ethan favouring the former, and me the latter).
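To make the two behaviours concrete, here is a rough sketch of a strict text-only helper layered on a type-preserving one. The names raw_fspath, strict_fspath and BytesEntry are hypothetical illustrations, not the proposed stdlib functions, and the error messages are invented for the example:

```python
# Hypothetical sketch only: these helpers are NOT the stdlib implementation.

def raw_fspath(obj):
    """Polymorphic variant: return str or bytes, preserving the input type."""
    if isinstance(obj, (str, bytes)):
        # str and bytes do not implement __fspath__, so pass them through.
        return obj
    try:
        method = type(obj).__fspath__
    except AttributeError:
        raise TypeError("expected str, bytes or path-like object, not "
                        + type(obj).__name__) from None
    result = method(obj)
    if not isinstance(result, (str, bytes)):
        raise TypeError("__fspath__ must return str or bytes, not "
                        + type(result).__name__)
    return result


def strict_fspath(obj):
    """Text-only variant: raise instead of letting bytes through."""
    result = raw_fspath(obj)
    if not isinstance(result, str):
        raise TypeError("expected a str path, got " + type(result).__name__)
    return result


class BytesEntry:
    """Stand-in for a DirEntry produced by scanning a bytes directory path."""

    def __init__(self, path):
        self._path = path

    def __fspath__(self):
        return self._path
```

With this split, raw_fspath(BytesEntry(b"README.md")) stays in the binary domain, while strict_fspath() on the same object raises TypeError rather than silently converting.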

> The guarantee we want to provide those folks is that if they're
> operating in the binary domain they'll stay there.

Et tu, Nick? "Guarantee"?! You can't guarantee any such thing with an implicitly invoked polymorphic API like this one -- unless you consider a crashed program to be in the binary domain. ;-)

I do, as one of the core changes in design philosophy between Python 2 and 3 is attempting to remove the implicit level shifting between the binary and text domains, and instead throw exceptions in those cases. Pragmatism requires us to keep some of them (e.g. the codecs module is officially object<->object in both Python 2 and Python 3, and string formatting codes can still do unexpected things), but a great many of them are already gone, and we don't want to add any new ones if alternative designs are available.
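A small illustration of that "throw an exception instead of level shifting" behaviour in Python 3. The helper name raises_type_error is made up for this example:

```python
import os.path


def raises_type_error(operation):
    """Return True if calling operation() raises TypeError."""
    try:
        operation()
    except TypeError:
        return True
    return False


# Mixing the binary and text domains is rejected rather than coerced.
mixed_concat = raises_type_error(lambda: b"data" + "text")
mixed_join = raises_type_error(lambda: os.path.join(b"dir", "file.txt"))

# Staying within a single domain works, and the result keeps that type.
text_path = os.path.join("dir", "file.txt")
binary_path = os.path.join(b"dir", b"file.txt")
```

Both mixed operations raise TypeError, while the same-domain joins return str and bytes respectively.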

Note that the current proposals don't even do that for the binary domain, only for the text domain!

Folks that want to ensure they're working in the binary domain can already do "memoryview(obj)" to ensure they have a bytes-like object without constraining it to a specific type.
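As a minimal sketch of that check (the function name require_binary is hypothetical):

```python
def require_binary(obj):
    """Return a memoryview of obj; raises TypeError if obj is not bytes-like."""
    return memoryview(obj)


# bytes and bytearray both pass, without being forced into one specific type.
views = [require_binary(b"README.md"), require_binary(bytearray(b"README.md"))]

# A str is rejected instead of being implicitly encoded.
try:
    require_binary("README.md")
    text_rejected = False
except TypeError:
    text_rejected = True
```

The check does not copy the data: the memoryview refers to the original object's buffer.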

Cheers, Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia


