Issue 13442: Better support for pipe I/O encoding in subprocess

Currently, pipes in the subprocess module work strictly with bytes I/O, unless you set "universal_newlines=True". In that case, it assumes an output encoding of UTF-8 for stdout and stderr and applies universal newlines processing.

When stdin/out/err are remapped to ordinary I/O streams then 'encoding' and 'errors' can be specified as usual, but it is currently challenging to do this for pipes. Since they're created internally by the subprocess module, user code doesn't get the opportunity to wrap them when using the convenience APIs. When using Popen objects, you have to create the object, then wrap each stream individually (rebinding the attributes as you go).
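For illustration, the manual wrapping described above might look roughly like the following. This is only a sketch: the helper name run_and_decode is made up for this example, and it merges stderr into stdout to keep things short.

```python
import io
import subprocess
import sys


def run_and_decode(cmd, encoding="utf-8", errors="strict"):
    """Run cmd and return (returncode, decoded output).

    Hypothetical helper, not part of subprocess: it shows the
    "create the Popen object, then rebind the stream" dance.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    # Rebind the bytes pipe to a text wrapper with an explicit encoding.
    proc.stdout = io.TextIOWrapper(proc.stdout, encoding=encoding,
                                   errors=errors)
    # Leaving the with block closes the wrapped pipe and waits for the child.
    with proc:
        text = proc.stdout.read()
    return proc.returncode, text


# The child writes raw UTF-8 bytes so the result does not depend on the locale.
child = [sys.executable, "-c",
         "import sys; sys.stdout.buffer.write('caf\\u00e9\\n'.encode('utf-8'))"]
returncode, text = run_and_decode(child)
```

The same rebinding has to be repeated for each of stdin, stdout and stderr that is a pipe, and it is not possible at all through the convenience functions, since they never expose the Popen object.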

My suggestion is that we add a new option for the stdin/out/err arguments:

class TextPipe:
    def __init__(self, encoding, errors='strict'):
        self.encoding = encoding
        self.errors = errors

So to read UTF-8 encoded data from a subprocess, you could just do:

data = check_output(cmd, stdout=TextPipe('utf-8'), stderr=STDOUT)
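Until something like that exists, the effect can be approximated in user code. The sketch below is an emulation, not the proposed implementation: check_output_text is a hypothetical name, TextPipe is copied from the suggestion above, and the output is decoded in one go after the process exits, so no universal newlines processing is applied.

```python
import subprocess
import sys


class TextPipe:
    # Same shape as the class suggested in this issue.
    def __init__(self, encoding, errors='strict'):
        self.encoding = encoding
        self.errors = errors


def check_output_text(cmd, pipe):
    """Hypothetical helper: capture stdout+stderr and decode per `pipe`."""
    raw = subprocess.check_output(cmd, stderr=subprocess.STDOUT)
    # Decode after the fact; line endings are left exactly as the child wrote them.
    return raw.decode(pipe.encoding, pipe.errors)


data = check_output_text([sys.executable, "-c", "print('hello')"],
                         TextPipe('utf-8'))
```

This only covers the read-everything-at-the-end case; incremental reads or a text-mode stdin would still need the streams to be wrapped.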

There are at least a couple of other alternatives here: