[Python-Dev] PEP 540: Add a new UTF-8 mode (v2)

Greg Ewing greg.ewing at canterbury.ac.nz
Fri Dec 8 00:20:49 EST 2017


Victor Stinner wrote:

> Users don't use stdin and stdout as regular files, they are more used as pipes to pass data between programs with the Unix pipe in a shell like "producer | consumer". Sometimes stdout is redirected to a file, but I consider that it is expected to behave as a pipe and the regular TTY stdout.

It seems weird to me to make a distinction between stdin/stdout connected to a file and accessing the file some other way.

It would be surprising, for example, if the following two commands behaved differently with respect to encoding:

cat foo | sort

cat < foo | sort

> But Naoki explained that open() is commonly misused to open binary files and Python should somehow fail badly to notify the developer of their mistake.

Maybe open() should default to surrogateescape if you explicitly ask for text mode, but use strict if you only get text mode by default?

I.e.

    open("foo", "rt") --> surrogateescape
    open("foo")       --> strict

That way you can easily open a file in a way that's compatible with the way stdin/stdout behave, but you will get bitten if you mistakenly open a binary file as text.
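
For illustration, here is a rough sketch of the difference between the two error handlers. It passes errors= explicitly, because the mode-dependent default suggested above is only a proposal and not how open() behaves; the sample bytes and the temporary file are made up for the example.

```python
import os
import tempfile

# Hypothetical sample data: bytes that are not valid UTF-8,
# as you would find when a binary file is opened as text by mistake.
data = b"abc\xff\xfe\n"

fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(data)

# surrogateescape: undecodable bytes are mapped to lone surrogates,
# so the original bytes can be recovered when encoding again.
with open(path, "rt", encoding="utf-8", errors="surrogateescape") as f:
    text = f.read()
roundtrip = text.encode("utf-8", errors="surrogateescape")

# strict: the same bytes raise UnicodeDecodeError, which is the
# "get bitten" case for a binary file opened as text.
try:
    with open(path, encoding="utf-8", errors="strict") as f:
        f.read()
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

os.remove(path)
```

With surrogateescape the data passes through unchanged, as it would in a pipeline; with strict the mistake is reported as soon as the bad bytes are read.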

-- Greg


