msg277821 - (view) |
Author: Adam Bartoš (Drekin) * |
Date: 2016-10-01 15:20 |
In my setting (Python 3.6b1 on Windows), passing a prompt that contains a non-ASCII character to input() results in mojibake. This is related to the recent fix of #1602 and so is Windows-specific.

    >>> input("α")
    ╬▒

The result corresponds to print("α".encode("utf-8").decode("cp852")), where cp852 is the default terminal encoding in my locale. |
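The mismatch described above can be reproduced without a Windows console. The following is a minimal sketch that uses only the standard codecs: it encodes the prompt as UTF-8 and decodes the resulting bytes with a legacy code page, which mirrors how the console interprets the bytes it receives. The helper name `mojibake` is hypothetical and is not part of CPython.

```python
def mojibake(text: str, console_encoding: str) -> str:
    """Return what a console using console_encoding would show for UTF-8 bytes."""
    # "α" is the two bytes CE B1 in UTF-8; a legacy code page maps each
    # byte to a separate (unrelated) character.
    return text.encode("utf-8").decode(console_encoding)


if __name__ == "__main__":
    # ascii() keeps the demo printable on any terminal encoding.
    for codepage in ("cp852", "cp437"):
        print(codepage, ascii(mojibake("α", codepage)))
```

Both code pages map the bytes CE and B1 to the same two box-drawing characters, which is consistent with the identical output reported for cp437.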
|
|
msg278274 - (view) |
Author: Terry J. Reedy (terry.reedy) *  |
Date: 2016-10-07 21:51 |
Same output with cp437. |
|
|
msg278275 - (view) |
Author: Terry J. Reedy (terry.reedy) *  |
Date: 2016-10-07 21:52 |
This is a regression from 3.5.2, where input("α") displays "α". |
|
|
msg278277 - (view) |
Author: Steve Dower (steve.dower) *  |
Date: 2016-10-07 22:08 |
This may force into 3.6 - we really ought to be getting and using sys.stdin and sys.stderr in PyOS_StdioReadline() rather than going directly to the raw streams. The problem here is that we're still using fprintf to output the prompt, even though we know (assume) the input is utf-8. I haven't looked closely at how safely we can use Python objects from this code, except to see that it's not obviously safe, but we should really figure out how to deal in Python str rather than C char* for the default readline implementation (and then only fall back on the GNU protocol when someone asks for it). The faster fix here would be to decode the prompt from utf-8 to utf-16-le in PyOS_StdioReadline and then write it using a wide-char output function. |
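The conversion step of the "faster fix" can be illustrated in Python. This is only a sketch of the transcoding idea (UTF-8 bytes in, UTF-16-LE code units out); the real change would live in C inside PyOS_StdioReadline and use a wide-char output function. The function name `utf8_prompt_to_wide` is hypothetical.

```python
from typing import Tuple


def utf8_prompt_to_wide(prompt: bytes) -> Tuple[bytes, int]:
    """Transcode a UTF-8 prompt to UTF-16-LE.

    Returns the encoded bytes and the number of 16-bit code units,
    which is the unit a wide-char console write would count in.
    """
    text = prompt.decode("utf-8")    # raises UnicodeDecodeError on invalid input
    wide = text.encode("utf-16-le")  # the "-le" codec does not emit a BOM
    return wide, len(wide) // 2


if __name__ == "__main__":
    wide, units = utf8_prompt_to_wide("αβψδ: ".encode("utf-8"))
    print(units, wide.hex())
```

Note that the count is in code units, not characters: a character outside the Basic Multilingual Plane occupies two units as a surrogate pair.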
|
|
msg278281 - (view) |
Author: Eryk Sun (eryksun) *  |
Date: 2016-10-07 22:32 |
When I pointed this issue out in code reviews, I assumed you would add the relatively simple fix to decode the prompt and call WriteConsoleW. The long-term fix in issue 17620 has to be worked out with cross-platform support, and ISTM that it can wait for 3.7.

Off topic: I just noticed that you're not calling PyOS_InputHook in the new PyOS_StdioReadline code. Tkinter registers this function pointer to call its EventHook. Do you want a separate issue for this, or is there a reason it was omitted? |
|
|
msg278284 - (view) |
Author: Eryk Sun (eryksun) *  |
Date: 2016-10-08 01:01 |
I'm sure Steve already has this covered, but FWIW here's a patch to call WriteConsoleW. Here's the result with the patch applied:

    >>> sys.ps1 = '»»» '
    »»» input("αβψδ: ")
    αβψδ: spam
    'spam'

and with interactive stdin and stdout/stderr redirected to a file:

    >set PYTHONIOENCODING=utf-8
    >amd64\python_d.exe >out.txt 2>&1
    input("αβψδ: ")
    spam
    ^Z
    >chcp 65001
    Active code page: 65001
    >type out.txt
    Python 3.6.0b1+ (default, Oct 7 2016, 23:47:58) [MSC v.1900 64 bit (AMD64)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> αβψδ: 'spam'
    >>>

If it can't write the prompt for some reason (e.g. out of memory, decoding fails, WriteConsole fails), it doesn't fall back on fprintf to write the prompt. Should it?

This should also get a test that calls ReadConsoleOutputCharacter to verify that the correct prompt is written. |
|
|
msg278317 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2016-10-08 19:19 |
New changeset faf5493e6f61 by Steve Dower in branch '3.6':
Issue #28333: Enables Unicode for ps1/ps2 and input() prompts. (Patch by Eryk Sun)
https://hg.python.org/cpython/rev/faf5493e6f61

New changeset cb62e921bd06 by Steve Dower in branch 'default':
Issue #28333: Enables Unicode for ps1/ps2 and input() prompts. (Patch by Eryk Sun)
https://hg.python.org/cpython/rev/cb62e921bd06 |
|
|
msg278318 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2016-10-08 19:21 |
New changeset 63ceadf8410f by Steve Dower in branch '3.6':
Issue #28333: Remove unnecessary increment.
https://hg.python.org/cpython/rev/63ceadf8410f

New changeset d76c8f9ea787 by Steve Dower in branch 'default':
Issue #28333: Remove unnecessary increment.
https://hg.python.org/cpython/rev/d76c8f9ea787 |
|
|
msg278319 - (view) |
Author: Steve Dower (steve.dower) *  |
Date: 2016-10-08 19:23 |
I made some minor tweaks to the patch (no need for strlen() - passing -1 works equivalently), but otherwise it's exactly what I would have done so I committed it. We currently have no tests to check which characters are written to a console output buffer. Issue28217 was tracking those, but considering how little code we have on top of output I don't think it's worth blocking anything on automating those tests. |
|
|
msg278624 - (view) |
Author: Eryk Sun (eryksun) *  |
Date: 2016-10-13 23:04 |
MultiByteToWideChar includes the trailing NUL when it gets the string length, so the WriteConsoleW call needs to use (wlen - 1). |
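The off-by-one can be modelled in a few lines of Python. This is a simulation, not a call into the Windows API: it assumes the documented behaviour that, when the input length is passed as -1, MultiByteToWideChar converts the terminating NUL as well and includes it in the returned count. The names `converted_length` and `units_written` are hypothetical.

```python
def converted_length(prompt: str) -> int:
    """Mimic the count returned when the input length is passed as -1.

    The terminating NUL is converted too, so the count is one larger
    than the number of visible UTF-16 code units.
    """
    return len((prompt + "\0").encode("utf-16-le")) // 2


def units_written(prompt: str, count: int) -> str:
    """Return the first `count` UTF-16 code units of the NUL-terminated buffer."""
    buffer = (prompt + "\0").encode("utf-16-le")
    return buffer[: count * 2].decode("utf-16-le")


if __name__ == "__main__":
    wlen = converted_length("> ")
    print(repr(units_written("> ", wlen)))      # the NUL is written as well
    print(repr(units_written("> ", wlen - 1)))  # only the prompt is written
```

Writing `wlen` units sends the NUL to the console along with the prompt; writing `wlen - 1` units sends the prompt alone.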
|
|
msg279427 - (view) |
Author: Steve Dower (steve.dower) *  |
Date: 2016-10-25 17:50 |
Not sure how I missed it originally, but that extra 1 char is actually very important:

    Python 3.6.0b2 (v3.6.0b2:b9fadc7d1c3f, Oct 10 2016, 20:36:51) [MSC v.1900 32 bit (Intel)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import sys
    >>> sys.ps1='> '
    >  sys

The extra space is because of that. Really ought to fix this before the next beta. |
|
|
msg279435 - (view) |
Author: Eryk Sun (eryksun) *  |
Date: 2016-10-25 18:08 |
I forgot to include the link to the python-list thread where this came up: https://mail.python.org/pipermail/python-list/2016-October/715428.html |
|
|
msg279445 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2016-10-25 18:52 |
New changeset 6b46c3deea2c by Steve Dower in branch '3.6':
Issue #28333: Fixes off-by-one error that was adding an extra space.
https://hg.python.org/cpython/rev/6b46c3deea2c

New changeset 44d15ba67d2e by Steve Dower in branch 'default':
Issue #28333: Fixes off-by-one error that was adding an extra space.
https://hg.python.org/cpython/rev/44d15ba67d2e |
|
|