[Python-ideas] changing sys.stdout encoding
Guido van Rossum guido at python.org
Wed Jun 13 07:21:45 CEST 2012
On Tue, Jun 12, 2012 at 9:58 PM, Stephen J. Turnbull <stephen at xemacs.org> wrote:
Oscar Benjamin writes:
 > I also think I missed something in this thread. At the beginning of the
 > original thread it seemed that everyone was agreed that
 >
 >     writer = codecs.getwriter(desiredencoding)
 >     sys.stdout = writer(sys.stdout.buffer)
 >
 > was a reasonable solution (with the caveat that it should happen before any
 > output is written). Is there some reason why this is not a good
 > approach?

It's undocumented and unobvious, but it's needed for standard stream filtering in some environments -- where a lot of coding is done by people who otherwise never need to understand streams at anything but a superficial level -- and the analogous case of a newly opened file, pipe, or socket is documented and obvious, and usable by novices. It's a damn shame that we can't say the same about the stdin, stdout, and stderr streams (even if I too have been at pains to explain why that's hard to fix).
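For reference, the workaround quoted above can be sketched roughly as follows. This is only an illustration: the helper name `rewrap` is made up for this example, and an `io.BytesIO` object stands in for `sys.stdout.buffer` so that the bytes actually written can be inspected.

```python
import codecs
import io


def rewrap(binary_stream, encoding):
    """Wrap a binary stream in a codecs StreamWriter for `encoding`.

    `rewrap` is a hypothetical helper name, not part of any library.
    """
    writer_cls = codecs.getwriter(encoding)
    return writer_cls(binary_stream)


# Stand-in for sys.stdout.buffer (an assumption made for this example).
raw = io.BytesIO()
out = rewrap(raw, "latin-1")

# Text written through the wrapper is encoded before it reaches `raw`.
out.write("h\u00e9llo\n")
```

In a real filter the same two steps would be applied to `sys.stdout.buffer` before any output is written, as the quoted caveat says.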
I'm probably missing something, but in all my naivete I have what feels like a simple solution, and I can't seem to see what's wrong with it.
In C there used to be a function to set the buffer size on an open stream that could only be called when the stream hadn't been used yet. ISTM the OP's use case would be covered by a similar function on an open TextIOWrapper to set the encoding that can only be used when it hasn't been used to write (or read) anything yet? When called under any other circumstances it should raise an error. The TextIOWrapper should maintain a "used" flag so that it can raise this exception reliably.
This ought to work for stdin and stdout when used at the start of the program, assuming nothing is written by code run before main starts. (This should normally be fine, otherwise you couldn't use a Python program as a filter at all.) It won't work for stderr if connected to a tty-ish device (since the version stuff is written there) but that should be okay, and it should still be okay with stderr if it's not a tty, since then it starts silent. (But I don't think the use case is very strong for stderr anyway.)
I'm not sure about a name, but it might well be called set_encoding(). The error message when misused should clarify to people who misunderstand the name that it can only be called when the stream hasn't been used yet; I don't think it's necessary to encode that information in the name. (C's setbuf() wasn't called set_buffer_on_virgin_stream() either. :-)
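A minimal sketch of the idea, under stated assumptions: the class name `EncodingSettableText` is hypothetical, and the "used" flag is kept in a small wrapper object rather than inside `TextIOWrapper` itself, which is where the proposal would actually put it. The encoding is changed by detaching the binary buffer and re-wrapping it, which is one possible way to do it and not necessarily how a built-in `set_encoding()` would work.

```python
import io


class EncodingSettableText:
    """Hypothetical text stream whose encoding can be set only before first use."""

    def __init__(self, buffer, encoding="utf-8"):
        self._text = io.TextIOWrapper(buffer, encoding=encoding)
        self._used = False  # flips to True on the first read or write

    def set_encoding(self, encoding):
        if self._used:
            raise ValueError(
                "set_encoding() can only be called before the stream "
                "has been read from or written to"
            )
        # detach() hands back the binary buffer without closing it.
        buffer = self._text.detach()
        self._text = io.TextIOWrapper(buffer, encoding=encoding)

    def write(self, s):
        self._used = True
        return self._text.write(s)

    def read(self, size=-1):
        self._used = True
        return self._text.read(size)

    def flush(self):
        self._text.flush()


# io.BytesIO stands in for the binary stream under sys.stdout.
raw = io.BytesIO()
stream = EncodingSettableText(raw, encoding="ascii")
stream.set_encoding("latin-1")  # allowed: nothing has been written yet
stream.write("\u00e9")
stream.flush()

try:
    stream.set_encoding("utf-8")  # rejected: the stream has been used
    misuse_error = None
except ValueError as exc:
    misuse_error = exc
```

Because the encoding can only change while the flag is still unset, everything that reaches the binary stream through this wrapper is in a single encoding.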
I don't care about the integrity of the underlying binary stream. It's a binary stream, you can write whatever bytes you want to it. But if a TextIOWrapper is used properly, it won't write a mixture of encodings to the underlying binary stream, since you can only set the encoding before reading/writing a single byte. (And the TextIOWrapper is careful not to use the binary stream before the first actual read() or write() call -- it just tries to call tell(), if it's seekable, which should be safe.)
--
--Guido van Rossum (python.org/~guido)