[Python-Dev] File system path encoding on Windows

Nick Coghlan ncoghlan at gmail.com
Sat Aug 20 15:31:27 EDT 2016


On 20 August 2016 at 04:59, Steve Dower <steve.dower at python.org> wrote:

Questions:

* should we always use the Windows Unicode APIs instead of switching between bytes/Unicode based on parameter type?

Yes

* should we allow users to pass bytes and interpret them as utf-8 rather than letting Windows do the decoding?

Yes (eventually)

* should we do it in 3.6, 3.7 or 3.8?

Reading your summary, this finally clicked with something Victor has been considering for a while: a "Force UTF-8" switch that tells Python to ignore the locale encoding on Linux and instead assume UTF-8 everywhere (command line parameter parsing, environment variable processing, filesystem encoding, standard streams, etc.).

It's essentially the same problem you have on Windows, just with slightly different symptoms and consequences.
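As a rough illustration of the shared problem (not taken from the original message): the same bytes give different text depending on which encoding is assumed when decoding them. The file name and the choice of cp1252 as the stand-in legacy encoding are assumptions made purely for the example.

```python
# Hypothetical file name; cp1252 stands in for any non-UTF-8 legacy encoding.
name = "café.txt"
raw = name.encode("utf-8")        # bytes as a UTF-8 aware program would produce them

as_utf8 = raw.decode("utf-8")     # interpreted as UTF-8: round-trips exactly
as_legacy = raw.decode("cp1252")  # interpreted with the legacy encoding: mojibake

print(as_utf8 == name)    # True
print(as_legacy == name)  # False, the two bytes of "é" decode as two characters
```

Whether the wrong guess comes from a Windows code page or a Linux locale, the effect on the resulting string is the same kind of corruption.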

Prompted by that realisation, I'd like to suggest an option that didn't come up on python-ideas: add such a flag to Python 3.6, and then actively seek feedback from folks using non-UTF-8 encodings before making a decision on what to do by default in Python 3.7.

This is a really hard problem for people to reason about abstractly, but "try running Python with this new flag, and see if anything breaks" is a much easier question to ask and answer.
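To make "see if anything breaks" easier to report on, a small script along these lines could record which encodings Python is actually using, so results can be compared between runs. This is only a sketch: the function name `report_encodings` is made up for the example, and since the message does not name the proposed flag, the script just reports the current state using standard `sys` and `locale` calls.

```python
import locale
import sys


def report_encodings():
    """Collect the encodings Python currently uses for the affected areas."""
    return {
        # Encoding derived from the locale settings (False: do not call setlocale).
        "locale": locale.getpreferredencoding(False),
        # Encoding used to convert between str and bytes file names.
        "filesystem": sys.getfilesystemencoding(),
        # Standard streams; a stream may be None in some embedded setups.
        "stdin": getattr(sys.stdin, "encoding", None),
        "stdout": getattr(sys.stdout, "encoding", None),
        "stderr": getattr(sys.stderr, "encoding", None),
    }


if __name__ == "__main__":
    for area, encoding in report_encodings().items():
        print(f"{area}: {encoding}")
```

Running it once under the normal configuration and once with UTF-8 forced would show exactly which of these areas changed.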

Cheers, Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia


