[Python-ideas] changing sys.stdout encoding

Stephen J. Turnbull stephen at xemacs.org
Wed Jun 6 05:28:57 CEST 2012


Amaury Forgeot d'Arc writes:

 > 2012/6/5 Stephen J. Turnbull <stephen at xemacs.org>
 >
 > > I wouldn't object to a method with the semantics of reinitialization, but it should have a name implying reinitialization. It probably should also error if the stream is open and has been written to.
 >
 > What do you think of the following method TextIOWrapper.reset_encoding? (the assert statements should certainly be replaced by some IOError)

I think that it's an attractive nuisance because it doesn't close the stream, and therefore permits changing the encoding without any warning partway through the stream. There are two reasonable (for a very generous definition of "reasonable") ways to handle multiple scripts in one stream: Unicode and ISO 2022. Simply changing encodings in the middle is a recipe for disaster in the absence of a higher-level protocol for signaling this change (that's the role ISO 2022 fulfils, but it is detested by almost everybody...). If you want to do that kind of thing, the "import codecs; sys.stdout = ..." idiom is available, but I don't see a need to make it convenient.
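To make the hazard concrete, here is a minimal sketch (not code from this thread) that uses an in-memory byte stream as a stand-in for stdout. The helper names `write_with_switch` and `decodes_as`, the sample text, and the choice of Shift_JIS as the second encoding are all assumptions made for illustration. It writes the same text twice, changing codecs partway through, and then checks whether a reader expecting a single encoding can decode the result.

```python
import codecs
import io

TEXT = "日本語\n"  # sample text, chosen only for illustration


def write_with_switch(stream):
    """Write TEXT twice to one byte stream, switching codecs partway."""
    codecs.getwriter("utf-8")(stream).write(TEXT)
    # Nothing in the byte stream signals that the encoding changes here.
    codecs.getwriter("shift_jis")(stream).write(TEXT)


def decodes_as(data, encoding):
    """Return True if the bytes decode cleanly under a single encoding."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


mixed = io.BytesIO()
write_with_switch(mixed)

clean = io.BytesIO()
codecs.getwriter("utf-8")(clean).write(TEXT * 2)

print("mixed stream decodes as UTF-8:", decodes_as(mixed.getvalue(), "utf-8"))
print("clean stream decodes as UTF-8:", decodes_as(clean.getvalue(), "utf-8"))
```

On a real stream, one common Python 3 spelling of the codecs idiom is `sys.stdout = codecs.getwriter("utf-8")(sys.stdout.buffer)`, done once at startup before anything has been written.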

But the OP's request is pretty clearly not for a generic .set_encoding(), it's for a more convenient way to initialize the stream for users.

Aside to Victor: at least on Mac OS X, I find that Python 3.2 (current MacPorts, I can investigate further if you need it) doesn't respect the language environment as I would expect it to. "LC_ALL=ja_JP.UTF8 python32" will give me an out-of-range Unicode error if I try to input Japanese using "import sys; sys.stdin.readline()" -- I have to use "PYTHONIOENCODING=UTF8" to get useful behavior.
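For anyone who wants to check what their own interpreter does, the following sketch starts a child interpreter with PYTHONIOENCODING set and reports the encoding it picked for sys.stdin. The helper name `child_stdin_encoding` is hypothetical, and the script assumes Python 3.7 or later for `subprocess.run(..., capture_output=True)`. It does not reproduce the Mac OS X locale problem itself; it only shows that the environment variable is what the child's stdin ends up using.

```python
import os
import subprocess
import sys


def child_stdin_encoding(io_encoding):
    """Run a child interpreter and return the encoding of its sys.stdin."""
    env = dict(os.environ)
    env["PYTHONIOENCODING"] = io_encoding  # overrides the locale-derived default
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.stdin.encoding)"],
        env=env,
        stdin=subprocess.DEVNULL,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


print(child_stdin_encoding("utf-8"))
```

Running the same child without PYTHONIOENCODING, under different LC_ALL values, is a quick way to see whether the locale is being respected on a given platform.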

There may also be cases where multiple users with different language needs are working at the same workstation.

For both of these cases a command-line option to initialize the encoding would be convenient.


