Issue 28162: WindowsConsoleIO readall() fails if first line starts with Ctrl+Z (original) (raw)

Created on 2016-09-15 03:25 by eryksun, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (7)

msg276508 - (view)

Author: Eryk Sun (eryksun) * (Python triager)

Date: 2016-09-15 03:25

For a console readall(), if the first line starts with '\x1a' (i.e. Ctrl+Z), it breaks out of its read loop before incrementing len. Thus the input isn't handled properly as EOF, for which the check requires len > 0. Instead it ends up calling WideCharToMultiByte with len == 0, which fails as follows:

>>> sys.stdin.buffer.raw.read()
^Z
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OSError: [WinError 87] The parameter is incorrect

msg276841 - (view)

Author: Steve Dower (steve.dower) * (Python committer)

Date: 2016-09-17 20:38

Unfortunately, very hard to test this because it requires interacting with an actual console. But it's an easy fix.

msg276845 - (view)

Author: Roundup Robot (python-dev) (Python triager)

Date: 2016-09-17 20:51

New changeset d0bab9fda568 by Steve Dower in branch '3.6': Issue #28161: Opening CON for write access fails https://hg.python.org/cpython/rev/d0bab9fda568

New changeset 187a114b9ef4 by Steve Dower in branch 'default': Issue #28161: Opening CON for write access fails https://hg.python.org/cpython/rev/187a114b9ef4

msg276848 - (view)

Author: Steve Dower (steve.dower) * (Python committer)

Date: 2016-09-17 20:54

The fix here was changing "n > 0 && buf[len] == '\x1a'" into "n == 0 || buf[len] == '\x1a'" when returning an empty bytes. Previously we were falling into the conversion code which wasn't happy with n == 0.

msg277093 - (view)

Author: Eryk Sun (eryksun) * (Python triager)

Date: 2016-09-21 05:46

For breaking out of the readall while loop, you only need to check if the current read is empty:

    /* when the read is empty we break */
    if (n == 0)
        break;

Also, the logic is wrong here:

if (len == 0 || buf[0] == '\x1a' && _buflen(self) == 0) {
    /* when the result starts with ^Z we return an empty buffer */
    PyMem_Free(buf);
    return PyBytes_FromStringAndSize(NULL, 0);
}

This is true when len is 0 or when buf[0] is Ctrl+Z and _buflen(self) is 0. Since buf[0] shouldn't ever be Ctrl+Z here (low-level EOF handling is abstracted in read_console_w), it's never checking the internal buffer. We can easily see this going wrong here:

>>> a = sys.stdin.buffer.raw.read(1); b = sys.stdin.buffer.raw.read()
Ā^Z
>>> a
b'\xc4'
>>> b
b''

It misses the remaining byte in the internal buffer.

This check can be simplified as follows:

rn = _buflen(self);

if (len == 0 && rn == 0) {
    /* return an empty buffer */
    PyMem_Free(buf);
    return PyBytes_FromStringAndSize(NULL, 0);
}

After this the code assumes that len isn't 0, which leads to more WideCharToMultiByte failure cases.

In the last conversion it's overwrite bytes_size without including rn.

I'm not sure what's going on with _PyBytes_Resize(&bytes, n * sizeof(wchar_t)). ISTM, it should be resized to bytes_size, and make sure this includes rn.

Finally, _copyfrombuf is repeatedly overwriting buf[0] instead of writing to buf[n].

With the attached patch, the behavior seems correct now:

>>> sys.stdin.buffer.raw.read()
^Z
b''

>>> sys.stdin.buffer.raw.read()
abc^Z
^Z
b'abc\x1a\r\n'

Split U+0100:

>>> a = sys.stdin.buffer.raw.read(1); b = sys.stdin.buffer.raw.read()
Ā^Z
>>> a
b'\xc4'
>>> b

b'\x80'

Split U+1234:

>>> a = sys.stdin.buffer.raw.read(1); b = sys.stdin.buffer.raw.read()
ሴ^Z
>>> a
b'\xe1'
>>> b
b'\x88\xb4'

The buffer still can't handle splitting an initial non-BMP character, stored as a surrogate pair. Both codes end up as replacement characters because they aren't transcoded as a unit.

Split U+00010000:

>>> a = sys.stdin.buffer.raw.read(1); b = sys.stdin.buffer.raw.read()
𐀀^Z
^Z
>>> a
b'\xef'
>>> b
b'\xbf\xbd\xef\xbf\xbd\x1a\r\n'

msg278320 - (view)

Author: Roundup Robot (python-dev) (Python triager)

Date: 2016-10-08 19:38

New changeset 4d4aefa52f49 by Steve Dower in branch '3.6': Issue #28162: Fixes Ctrl+Z handling in console readall() https://hg.python.org/cpython/rev/4d4aefa52f49

New changeset 947fa496ca6f by Steve Dower in branch 'default': Issue #28162: Fixes Ctrl+Z handling in console readall() https://hg.python.org/cpython/rev/947fa496ca6f

msg278321 - (view)

Author: Steve Dower (steve.dower) * (Python committer)

Date: 2016-10-08 19:44

I didn't see your patch, but I made basically the same fix and added a test.

History

Date

User

Action

Args

2022-04-11 14:58:36

admin

set

github: 72349

2017-03-31 16:36:32

dstufft

set

pull_requests: + <pull%5Frequest1057>

2016-10-08 19:44:31

steve.dower

set

status: open -> closed

messages: +

2016-10-08 19:38:08

python-dev

set

messages: +

2016-09-21 05:46:33

eryksun

set

status: closed -> open
files: + issue_28162_01.patch
messages: +

keywords: + patch

2016-09-17 20:54:15

steve.dower

set

status: open -> closed
resolution: fixed
messages: +

stage: needs patch -> resolved

2016-09-17 20:51:44

python-dev

set

nosy: + python-dev
messages: +

2016-09-17 20:38:02

steve.dower

set

assignee: steve.dower
messages: +
stage: test needed -> needs patch

2016-09-15 03:25:17

eryksun

create