
On 20 April 2016 at 13:16, Stephen J. Turnbull <stephen@xemacs.org> wrote:

> It's people who live in monolingual mono-encoding environments who
> will be using bytes successfully, and be resistant to costly changes
> that don't make their lives better. But the bytes vs. text cost is
> inherent in using pathlib, so polymorphism doesn't help promote
> pathlib. It might help promote use of os.scandir in bytes-oriented
> code, though I don't see that as a huge effect nor more than mildly
> desirable. Is it?

Some of us are also interested in optimised network service development use cases where UTF-8 already rules the world \[1\]. It's a vastly different domain from desktop computing, and different even from traditional stateful servers where the same instance may be kept running for years.

When "absolutely everything is UTF-8, and your system boundaries are policed accordingly" is a valid assumption, then writing bytes-level network code is a far more viable option than when you're writing software to give to other people to run in arbitrary environments (that's how Go is able to get away with its "all system boundaries use UTF-8" approach - if you're not prepared to meet that precondition, you don't choose to use Go in the first place).

I think this is also why we're talking past each other - as a default, I completely agree it makes sense to present a "str-only" API (that's where my proposed fspath/\_raw\_fspath split came from). However, there really are contexts where "our text is always stored as bytes, those bytes are always UTF-8 encoded, and our software only needs to work on \*nix systems" is a reasonable approach, and those are the domains where being \*able\* to stay entirely in the binary domain is actually a desirable characteristic, rather than merely a tool for migrating from Python 2.
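As a rough illustration of what "staying entirely in the binary domain" can look like with the existing os APIs, here is a minimal sketch. It assumes Python 3.6+ and a filesystem that stores UTF-8 names unchanged; the helper name list_names_as_bytes and the demo file name are hypothetical, chosen only for this example. Passing a bytes path to os.scandir yields bytes entry names, so no decoding step is needed anywhere:

```python
import os
import tempfile


def list_names_as_bytes(directory: bytes) -> list:
    """Return the sorted entry names in `directory` as bytes objects.

    Hypothetical helper: because `directory` is bytes, os.scandir
    reports each entry name as bytes as well, so nothing is decoded.
    """
    with os.scandir(directory) as entries:
        return sorted(entry.name for entry in entries)


# Demo under the "everything is UTF-8" assumption from the text.
with tempfile.TemporaryDirectory() as tmp:
    root = os.fsencode(tmp)               # str path -> bytes path
    name = "日本語.txt".encode("utf-8")    # name is handled as raw UTF-8 bytes
    with open(os.path.join(root, name), "wb") as f:
        f.write(b"payload")
    names = list_names_as_bytes(root)
```

Every path operation here takes and returns bytes; the only str involved is the temporary directory name handed back by tempfile, which is converted once at the boundary with os.fsencode.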

Cheers,

Nick.

\[1\] http://utf8everywhere.org/

Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia