[Python-Dev] File system path encoding on Windows

Nick Coghlan ncoghlan at gmail.com
Sun Aug 28 22:54:35 EDT 2016


On 29 August 2016 at 06:39, <tritium-list at sdamon.com> wrote:

-----Original Message-----
From: Python-Dev [mailto:python-dev-bounces+tritium-list=sdamon.com at python.org] On Behalf Of Steve Dower
Sent: Wednesday, August 24, 2016 11:44 AM
To: Stephen J. Turnbull <turnbull.stephen.fw at u.tsukuba.ac.jp>
Cc: Nick Coghlan <ncoghlan at gmail.com>; Python Dev <python-dev at python.org>
Subject: Re: [Python-Dev] File system path encoding on Windows

On 23Aug2016 2150, Stephen J. Turnbull wrote:
> Steve Dower writes:
>
> > * Stephen sees "no reason not to change locale.getpreferredencoding()"
> > (default encoding for open()) at the same time with the same switches,
> > while I'm not quite as confident. Do users generally specify an encoding
> > these days? I know I always put utf-8 there.
>
> I was insufficiently specific. "No reason not to" depends on separate
> switches for file system encoding and preferred encoding. That makes
> things somewhat more complicated for implementation, and significantly
> so for users.

Yes, it does, but it's about the only possible migration path. I know Nick and Victor like the idea of a -X flag (or a direct -utf8 flag),

Command line flags and environment variables aren't mutually exclusive.

The idea of a "-X" flag is to have an easy way to try out new default settings that say "Ignore the nominal default encoding and just assume UTF-8 everywhere", such that folks can get Python behaving that way even if they haven't quite figured out how to configure their operating system itself to use those defaults. This is particularly relevant for non-systemd based Linux systems, as other init systems generally don't have the ability to universally override the default POSIX locale. Even systemd based systems can still do the wrong thing if "LANG=C" is explicitly specified instead of "LANG=C.UTF-8", if the particular distro in use doesn't have the latter locale available, or if a client's locale settings have been forwarded to a server SSH session.
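For anyone wanting to see which defaults their interpreter has actually picked up, a minimal sketch along the following lines may help. It only uses standard library calls; the helper name describe_default_encodings is made up for this illustration and is not part of any proposal in this thread.

```python
import locale
import os
import sys


def describe_default_encodings():
    """Collect the encoding defaults the interpreter is currently using.

    The function name is hypothetical; the calls inside are standard library.
    """
    return {
        # Encoding used to convert between str paths and OS-level names.
        "filesystem": sys.getfilesystemencoding(),
        # Locale-derived encoding that text I/O typically defaults to
        # when no explicit encoding is given.
        "preferred": locale.getpreferredencoding(False),
        # Locale variables that commonly decide the above on POSIX systems.
        "LANG": os.environ.get("LANG"),
        "LC_ALL": os.environ.get("LC_ALL"),
    }


if __name__ == "__main__":
    for name, value in describe_default_encodings().items():
        print(f"{name}: {value}")
```

Running this once normally and once with "LANG=C" set in the environment is one way to observe the kind of mismatch described above, though the exact values reported depend on the platform and Python version.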

but I prefer more specific environment variables:

- PYTHONWINDOWSLEGACYSTDIO (for the console changes)
- PYTHONWINDOWSLEGACYPATHENCODING (assuming getfilesystemencoding() is utf8)
- PYTHONWINDOWSLEGACYLOCALEENCODING (assuming getpreferredencoding() is utf8)

Once you get to var lengths like that, arcane single character flags start looking preferable. How about "PYTHONWINLEGACY" to just turn it all on or off? If the code breaks on one thing, it obviously isn't written to use the other two, so might as well shut them all off.

+1 for a single global on-off switch that means we (and everyone else) only have two configurations to test (the old way and the new way), rather than a combinatorial explosion of 8 or more possible settings. If we get demand for more configurability, then we can increase the complexity, but we shouldn't inflict that level of QA pain on ourselves in the absence of vocal user demand.
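The "8 or more" figure follows from three independent on/off switches giving 2**3 combinations. A small sketch of that arithmetic, using placeholder switch names that are assumptions for illustration rather than real settings:

```python
from itertools import product

# Placeholder names standing in for the three fine-grained switches
# discussed in the thread; they are not real configuration keys.
FINE_GRAINED = ("stdio", "path_encoding", "locale_encoding")


def fine_grained_configs():
    """Every combination of three independent legacy/new switches."""
    return [
        dict(zip(FINE_GRAINED, values))
        for values in product((False, True), repeat=len(FINE_GRAINED))
    ]


def global_switch_configs():
    """A single global switch: everything legacy, or everything new."""
    return [dict.fromkeys(FINE_GRAINED, legacy) for legacy in (False, True)]


if __name__ == "__main__":
    print(len(fine_grained_configs()))   # 8 configurations to test
    print(len(global_switch_configs()))  # 2 configurations to test
```

Each additional independent switch doubles the first number, while the second stays at two.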

If the more fine-grained settings are considered useful for debugging purposes, then they can be added with underscore prefixes and documented accordingly (i.e. as a way of figuring out precisely which aspect of the new defaults is causing a problem, rather than as a permanently supported variant configuration).
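As a rough sketch of how a global switch with underscore-prefixed debugging overrides could be resolved: the variable names, the "1"/"0" value convention, and the function resolve_legacy_settings below are all assumptions made up for this example, not anything CPython implements or that was agreed in this thread.

```python
# Hypothetical feature suffixes, borrowed from the variable names
# floated earlier in the thread.
FEATURES = ("STDIO", "PATHENCODING", "LOCALEENCODING")


def resolve_legacy_settings(environ):
    """Return {feature: use_legacy_behaviour} for a mapping of env vars.

    Assumed convention (illustrative only):
    - PYTHONWINLEGACY set to any non-empty value turns everything legacy.
    - _PYTHONWINLEGACY<FEATURE> set to "1" or "0" overrides one feature,
      intended for debugging rather than as a supported configuration.
    """
    global_legacy = bool(environ.get("PYTHONWINLEGACY"))
    settings = {}
    for feature in FEATURES:
        override = environ.get("_PYTHONWINLEGACY" + feature)
        if override is None:
            settings[feature] = global_legacy
        else:
            settings[feature] = override == "1"
    return settings


if __name__ == "__main__":
    import os

    print(resolve_legacy_settings(os.environ))
```

With this shape, the supported test matrix stays at two configurations, and the overrides only exist to narrow down which of the new defaults is responsible for a failure.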

Cheers, Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia


