Issue 34421: Cannot install package with unicode module names on Windows

Created on 2018-08-17 17:06 by julien.malard, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Pull Requests
URL Status Linked
PR 8799 merged julien.malard,2018-08-17 17:10
PR 9117 merged miss-islington,2018-09-08 20:32
PR 9118 merged miss-islington,2018-09-08 20:32
PR 9126 merged serhiy.storchaka,2018-09-09 14:15
PR 9503 closed miss-islington,2018-09-23 06:13
PR 9504 closed miss-islington,2018-09-23 06:13
PR 9506 merged serhiy.storchaka,2018-09-23 06:51
PR 9510 merged miss-islington,2018-09-23 07:32
Messages (14)
msg323699 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-08-18 10:32
Please provide more details. How can the issue be reproduced? What did you get, and what did you expect to get? It seems the code just before the lines modified by your PR is intended to solve this issue. Why doesn't it work?
msg323714 - (view) Author: Julien Malard (julien.malard) * Date: 2018-08-18 14:55
Hello, Yes, it does seem odd that that code does not work. On my Windows machine (Windows 7, 64-bit, running 32-bit Python) I checked, and it seems that the code in the if block immediately preceding my PR does not run at all, hence the error. For a reproducible example, my Taqdir package, which consists mostly of packages and modules with unicode names, runs into this issue (and installs successfully after my proposed fix here combined with a separate PR in pip). Perhaps the most easily accessible example would be the Appveyor build (https://ci.appveyor.com/project/julienmalard/Tinamit) for my Tinamit project, which has Taqdir as a dependency. Thanks! -Julien Malard
msg324861 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2018-09-08 20:31
New changeset 0afada163c7ef25c3a9d46ed445481fb69f2ecaf by Éric Araujo (Julien Malard) in branch 'master': bpo-34421 avoid unicode error in distutils logging (GH-8799) https://github.com/python/cpython/commit/0afada163c7ef25c3a9d46ed445481fb69f2ecaf
msg324862 - (view) Author: miss-islington (miss-islington) Date: 2018-09-08 20:44
New changeset 3b36642924a51e6bceb7033916c3049764817166 by Miss Islington (bot) in branch '3.6': bpo-34421 avoid unicode error in distutils logging (GH-8799) https://github.com/python/cpython/commit/3b36642924a51e6bceb7033916c3049764817166
msg324863 - (view) Author: miss-islington (miss-islington) Date: 2018-09-08 20:53
New changeset 77b92b15a5e5c84b91d3fd9d02f63db432fa8903 by Miss Islington (bot) in branch '3.7': bpo-34421 avoid unicode error in distutils logging (GH-8799) https://github.com/python/cpython/commit/77b92b15a5e5c84b91d3fd9d02f63db432fa8903
msg324874 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-09-09 07:06
I would prefer to use the backslashreplace error handler rather than the unicode-escape codec, just as a few lines above, but with ASCII encoding:

    msg = msg.encode('ascii', 'backslashreplace').decode('ascii')

It is still not clear to me why the current code intended to handle this problem doesn't work in this case. We need to find the cause and fix the existing solution.
msg324878 - (view) Author: Jeremy Kloth (jkloth) * Date: 2018-09-09 09:47
The existing re-code solution is not being triggered, as the `errors` in this case is 'surrogateescape' with an encoding of 'cp1252'. Here, pip is using subprocess.Popen() to have Python run setup.py. During execution, a filename, 'taqdir\\\u0634\u0645\u0627\u0631.py', is logged which has characters not encodable in cp1252.

I think that here, Python is not configuring its stdin/stdout/stderr streams correctly when run as a subprocess connected to pipes. Or, at least, subprocess.Popen() isn't passing the right (or enough) information to Python to get itself configured. There should ultimately be a way to have Python (in a subprocess, on Windows) pass Unicode through untouched to its calling process. I suppose it would mean setting the PYTHONIOENCODING envvar when using subprocess.

After all that, it seems that: 1) pip needs to be changed to call Python subprocesses in a way that enables lossless unicode transmission, 2) change the `errors` check in distutils.log to include 'surrogateescape'? (the heart of this issue)
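A minimal sketch of the failure described above, assuming an in-memory stream as a stand-in for the cp1252 pipe that pip connects to the subprocess. The helper names `make_pipe_like_stream` and `write_escaped` are hypothetical and are not part of distutils or pip; the sketch only illustrates that 'surrogateescape' does not rescue characters that cp1252 cannot represent, while re-coding with 'backslashreplace' does.

```python
import io

# Filename from the report: Arabic letters that cp1252 cannot represent.
FILENAME = "taqdir\\\u0634\u0645\u0627\u0631.py"


def make_pipe_like_stream(errors):
    """Return (text stream, byte buffer) imitating a cp1252 pipe (assumption)."""
    raw = io.BytesIO()
    # newline="\n" disables newline translation so the bytes are predictable.
    stream = io.TextIOWrapper(raw, encoding="cp1252", errors=errors, newline="\n")
    return stream, raw


def write_escaped(stream, msg):
    """Hypothetical helper: escape what the stream's encoding cannot hold."""
    encoding = stream.encoding
    msg = msg.encode(encoding, "backslashreplace").decode(encoding)
    stream.write(msg + "\n")
    stream.flush()


# 1) Writing the raw message fails even though errors is not 'strict'.
stream, raw = make_pipe_like_stream("surrogateescape")
try:
    stream.write(FILENAME + "\n")
    stream.flush()
    failed = False
except UnicodeEncodeError:
    failed = True

# 2) Re-coding with backslashreplace first makes the write succeed.
stream2, raw2 = make_pipe_like_stream("surrogateescape")
write_escaped(stream2, FILENAME)
escaped_bytes = raw2.getvalue()

print(failed)
print(escaped_bytes)
```

On encoding, 'surrogateescape' only handles lone surrogates in the range U+DC80 to U+DCFF, so any other unencodable character still raises UnicodeEncodeError.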
msg324888 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-09-09 14:25
PR 9126 makes distutils.log use "backslashreplace" instead of "unicode-escape" and simplifies the code (it is more efficient now, although the performance of logging is not critical). "unicode-escape" escapes all non-ASCII characters, even encodable ones. It also escapes control characters like \t, \b, \r or \x1b (which starts control sequences for ANSI compatible terminals), which may not be desirable.
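The difference between the two approaches can be seen in a short sketch. The sample message and the choice of cp1252 as the target encoding are assumptions made for illustration only; this is not the code from PR 9126.

```python
# A message mixing an encodable non-ASCII letter (e with acute accent),
# a tab character, and an Arabic letter that cp1252 cannot encode.
msg = "caf\u00e9\t\u0634.py"

# unicode-escape: escapes every non-ASCII character and the tab as well.
via_unicode_escape = msg.encode("unicode-escape").decode("ascii")

# backslashreplace with the stream encoding: only the unencodable
# Arabic letter is escaped; the accented letter and the tab survive.
via_backslashreplace = msg.encode("cp1252", "backslashreplace").decode("cp1252")

# backslashreplace with ASCII: all non-ASCII characters are escaped,
# but control characters such as the tab are left alone.
via_ascii = msg.encode("ascii", "backslashreplace").decode("ascii")

# ascii() keeps the demonstration printable on any console encoding.
print(ascii(via_unicode_escape))
print(ascii(via_backslashreplace))
print(ascii(via_ascii))
```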
msg324889 - (view) Author: Julien Malard (julien.malard) * Date: 2018-09-09 14:46
Hello, Thanks for the insights and better fixes. Regarding (1), do you have any pointers on how or where to fix pip? I have an in-progress pull request there (https://github.com/pypa/pip/pull/5712) to fix a related unicode error during installation and could perhaps combine both solutions. Thanks! -Julien
msg324907 - (view) Author: Jeremy Kloth (jkloth) * Date: 2018-09-10 02:57
For pip, in call_subprocess() (given here in rough pseudo-code):

    is_python = (cmd[0] == sys.executable)
    kwds = {}
    if is_python:
        env['PYTHONIOENCODING'] = 'utf8'
        kwds['encoding'] = 'utf8'
    proc = Popen(..., **kwds)
    . . .
    if stdout is not None:
        while True:
            line = proc.stdout.readline()
            # When running Python, the output is already Unicode
            if not is_python:
                line = console_to_str(line)
            if not line:
                break

Hopefully, there is enough context to figure out the exact placement.
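The idea in the pseudo-code can be tried out in isolation with a small runnable sketch. It is not pip's call_subprocess(); the function name `run_python_utf8` is hypothetical, and the sketch assumes Python 3.6 or later (for the `encoding` argument of Popen) and a child interpreter that honours PYTHONIOENCODING.

```python
import os
import subprocess
import sys


def run_python_utf8(code):
    """Run `code` in a child Python and read its output losslessly as UTF-8.

    Hypothetical helper: the child is told to emit UTF-8 through
    PYTHONIOENCODING, and the parent decodes the pipe with the same encoding.
    """
    env = dict(os.environ, PYTHONIOENCODING="utf8")
    proc = subprocess.Popen(
        [sys.executable, "-c", code],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        env=env,
        encoding="utf8",
    )
    lines = []
    # With `encoding` set, proc.stdout yields str, so no console_to_str()
    # style decoding step is needed for these lines.
    for line in proc.stdout:
        lines.append(line.rstrip("\r\n"))
    proc.wait()
    return proc.returncode, lines


# The child source uses \u escapes so the command line itself stays ASCII.
returncode, lines = run_python_utf8("print('\\u0634\\u0645\\u0627\\u0631.py')")
print(returncode, [ascii(line) for line in lines])
```

Setting both sides explicitly avoids depending on the console code page, which is what made the cp1252 pipe lossy in the first place.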
msg324921 - (view) Author: Julien Malard (julien.malard) * Date: 2018-09-10 12:26
Thanks! Will give it a try and reference this conversation here as background.
msg326137 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-09-23 06:13
New changeset 4b860fd777e983f5d2a6bd1288e2b53099c6a803 by Serhiy Storchaka in branch 'master': bpo-34421: Improve distutils logging for non-ASCII strings. (GH-9126) https://github.com/python/cpython/commit/4b860fd777e983f5d2a6bd1288e2b53099c6a803
msg326143 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-09-23 07:31
New changeset c73df53569f86d0c7742bafa55958c53d57a02e4 by Serhiy Storchaka in branch '3.7': bpo-34421: Improve distutils logging for non-ASCII strings. (GH-9126) (GH-9506) https://github.com/python/cpython/commit/c73df53569f86d0c7742bafa55958c53d57a02e4
msg326145 - (view) Author: miss-islington (miss-islington) Date: 2018-09-23 07:54
New changeset 0b67995bfa45393585e2e0017c82c706c4a04b04 by Miss Islington (bot) in branch '3.6': bpo-34421: Improve distutils logging for non-ASCII strings. (GH-9126) (GH-9506) https://github.com/python/cpython/commit/0b67995bfa45393585e2e0017c82c706c4a04b04
History
Date User Action Args
2022-04-11 14:59:04 admin set github: 78602
2018-09-23 11:11:42 serhiy.storchaka set status: open -> closed; stage: patch review -> resolved
2018-09-23 07:54:02 miss-islington set messages: +
2018-09-23 07:32:07 miss-islington set pull_requests: + pull_request8916
2018-09-23 07:31:56 serhiy.storchaka set messages: +
2018-09-23 06:51:00 serhiy.storchaka set pull_requests: + pull_request8913
2018-09-23 06:13:23 miss-islington set pull_requests: + pull_request8911
2018-09-23 06:13:14 miss-islington set stage: commit review -> patch review; pull_requests: + pull_request8910
2018-09-23 06:13:03 serhiy.storchaka set messages: +
2018-09-10 12:26:50 julien.malard set messages: +
2018-09-10 02:57:57 jkloth set messages: +
2018-09-09 14:46:55 julien.malard set messages: +
2018-09-09 14:25:50 serhiy.storchaka set messages: +; stage: patch review -> commit review
2018-09-09 14:15:47 serhiy.storchaka set stage: commit review -> patch review; pull_requests: + pull_request8579
2018-09-09 09:47:56 jkloth set nosy: + jkloth; messages: +
2018-09-09 07:06:29 serhiy.storchaka set messages: +
2018-09-08 20:53:02 miss-islington set messages: +
2018-09-08 20:44:23 miss-islington set status: pending -> open; nosy: + miss-islington; messages: +
2018-09-08 20:35:11 eric.araujo set status: open -> pending; versions: + Python 3.7, Python 3.8; resolution: fixed; assignee: eric.araujo; type: crash -> behavior; stage: patch review -> commit review
2018-09-08 20:32:13 miss-islington set pull_requests: + pull_request8571
2018-09-08 20:32:05 miss-islington set pull_requests: + pull_request8570
2018-09-08 20:31:29 eric.araujo set messages: +
2018-08-18 14:55:28 julien.malard set messages: +
2018-08-18 10:32:21 serhiy.storchaka set nosy: + serhiy.storchaka; messages: +
2018-08-17 17:10:09 julien.malard set keywords: + patch; stage: patch review; pull_requests: + pull_request8274
2018-08-17 17:06:15 julien.malard create