Issue 28114: parse_envlist(): os.execve(), os.spawnve(), etc. crash in Python 3.6.0 when env contains byte strings (original) (raw)

process

Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: berker.peksag, cgohlke, eryksun, ezio.melotti, paul.moore, python-dev, serhiy.storchaka, steve.dower, tim.golden, vstinner, zach.ware
Priority: normal Keywords: patch

Created on 2016-09-13 06:55 by cgohlke, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
callstack.txt cgohlke,2016-09-13 06:55
issue28114_unix.diff berker.peksag,2016-09-14 07:28 review
issue_28114_01.patch eryksun,2016-09-14 08:48 review
issue_28114_02.patch eryksun,2016-09-14 14:19 review
issue_28114_03.patch eryksun,2016-09-14 15:12 review
Pull Requests
URL Status Linked Edit
PR 552 closed dstufft,2017-03-31 16:36
Messages (19)
msg276189 - (view) Author: Christoph Gohlke (cgohlke) Date: 2016-09-13 06:55
Trying to build numpy-1.11.2rc1 wheels for Python 3.6.0b1 on Windows 10 with `python.exe setup.py bdist_wheel`, python36.dll (32 and 64 bit) segfaults in the `find_maxchar_surrogates` function. Some other packages built OK, e.g. Cython, pyzmq and Pillow. The crash is at <https://hg.python.org/cpython/file/v3.6.0b1/Objects/unicodeobject.c#l1607>. The call stack is attached. Locals: + begin 0x00454c49464f5250 const wchar_t * ch 1509860640 unsigned int + end 0x004550fc90512200 const wchar_t * + iter 0x00000259a500e7d8 L"\x2" const wchar_t * + maxchar 0x0000005a59fea520 {0} unsigned int * + num_surrogates 0x0000005a59fea528 {0} __int64 *
msg276194 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-09-13 07:18
end-begin = 0x004550fc90512200-0x00454c49464f5250 = 5168087289776 Seems find_maxchar_surrogates() is called with wrong arguments.
msg276215 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-09-13 08:57
> Trying to build numpy-1.11.2rc1 wheels for Python 3.6.0b1 on Windows 10 with `python.exe setup.py bdist_wheel`, python36.dll (32 and 64 bit) segfaults in the `find_maxchar_surrogates` function. I ran "python setup.py bdist_wheel" on numpy-1.11.2rc1 on Linux: no issue. I compiled Python 3.6 (the default branch in fact, so the future 3.7) in debug mode/64 bit with SSL, but the setup.py command failed with: Don't know how to compile Fortran code on platform 'nt' I don't want to spend too much time to try to reproduce the issue. Can you please try to get the Python traceback? Try: python -X faulthandler setup.py bdist_wheel It seems like the crash occurs in parse_envlist(), function to parse environment variables in os.spawnve(). Can you try to dump the env parameter of os.spawnve()? For example, you can put the following lines at the beginning of setup.py: --- import os orig_spawnve = os.spawnve def log_spawnve(*args): print("spawnve: %r" % (args,)) return orig_spawnve(*args) os.spawnve = log_spawnve --- My test: --- >>> import os, sys; args=[sys.executable, "-c", "pass"]; os.spawnve(os.P_WAIT, args[0], args, None) spawnve: (0, '/home/haypo/prog/python/default/python', ['/home/haypo/prog/python/default/python', '-c', 'pass'], None) 127 ---
msg276234 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2016-09-13 09:57
parse_envlist is calling PyUnicode_FromFormat with the format "%U=%U", but passing it bytes: |Debug Error! Program: C:\Program Files\Python36\python_d.exe abort() has been called (Press Retry to debug the application) (1e8.374): Break instruction exception - code 80000003 (first chance) ucrtbased!issue_debug_notification+0x45: 00007ffc`aa1f1985 cc int 3 1:002> k 0n12 Child-SP RetAddr Call Site 00000030`a5fe9bb0 00007ffc`aa1f1b23 ucrtbased!issue_debug_notification+0x45 00000030`a5fe9c00 00007ffc`aa212d7d ucrtbased!__acrt_report_runtime_error+0x13 00000030`a5fe9c60 00007ffc`aa2184cf ucrtbased!abort+0x1d 00000030`a5fe9ca0 00007ffc`aa2164c8 ucrtbased!common_assert_to_stderr<wchar_t>+0xbf 00000030`a5fe9d00 00007ffc`aa218e7f ucrtbased!common_assert<wchar_t>+0x68 00000030`a5fe9d40 00000000`60e59045 ucrtbased!_wassert+0x2f 00000030`a5fe9d70 00000000`60eb9f71 python36_d!_PyUnicode_CheckConsistency+0x45 00000030`a5fe9e00 00000000`60e4cd07 python36_d!unicode_fromformat_arg+0xc81 00000030`a5fe9fa0 00000000`60e4cc31 python36_d!PyUnicode_FromFormatV+0xa7 00000030`a5fea040 00000000`60c69853 python36_d!PyUnicode_FromFormat+0x31 00000030`a5fea080 00000000`60c68286 python36_d!parse_envlist+0x1c3 00000030`a5fea180 00000000`60c5d4ee python36_d!os_spawnve_impl+0x1d6 1:002> .frame 6 06 00000030`a5fe9d70 00000000`60eb9f71 python36_d!_PyUnicode_CheckConsistency+0x45 1:002> ?? op->ob_type->tp_name char * 0x00000000`60fe1248 "bytes" parse_envlist used to call PyUnicode_FSConverter on key and val; get the underlying buffers via PyBytes_AsString; allocate a new buffer and copy key and val into "%s=%s". Now it instead calls PyUnicode_FromFormat for "%U=%U" and then fsconvert_strdup, which has been modified to call PyUnicode_AsWideCharString on Windows and otherwise PyUnicode_FSConverter. For Windows it could use PyUnicode_FSDecoder on key and val in combination with PyUnicode_FromFormat and the format string "%U=%U". For Unix it could use PyUnicode_FSConverter for key and val in combination with PyBytes_FromFormat and the format string "%s=%s".
msg276235 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-09-13 10:04
> parse_envlist is calling PyUnicode_FromFormat with the format "%U=%U", but passing it bytes: Ah, it's a regression introduced by the issue #27781 with the change e20c7d8a8187: parse_envlist() was modified to first call PyUnicode_FromFormat("%U=%U") and then convert.
msg276241 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2016-09-13 10:43
Here's a snippet to reproduce this bug: import os, sys environb = {os.fsencode(k):os.fsencode(v) for k,v in os.environ.items()} os.spawnve(os.P_WAIT, sys.executable, ('python', '--version'), environb) (Now that Windows Python provisionally supports bytes via utf-8:surrogatepass, maybe it should also have an os.environb mapping.)
msg276304 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2016-09-13 17:06
Whoops, yeah that's my fault. I'll get to it before b2, but I need to catch up on my day job the next week or two, so if someone else wants to apply a fix (and add a test - test_crashers, presumably?) then go for it!
msg276388 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2016-09-14 07:28
Here's a patch for Unix. I will add Windows support when I get my Windows VM. > [...] (and add a test - test_crashers, presumably?) [...] Unfortunately, test_crashers doesn't run since 2011 (skipped in 481ad9129a0f.) parse_envlist() is only used by os.execve() and os.spawnve() so I'm not sure what's the best way to test the patch (other than adapting Eryk's snippet in .)
msg276391 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-09-14 07:55
The issue is not specific to Windows, the following example also crash on Linux: import os, sys args = [sys.executable, "-c", "pass"] os.execve(args[0], args, os.environb)
msg276400 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2016-09-14 08:48
Berker, this is basically what I had in my initial patch on the Unix side. I also addressed the Windows issues in parse_envlist and fsconvert_strdup. I'm uploading that patch for reference. It needs a test. I also need to verify that there are no additional problems with the high-level os._execvpe function regarding bytes support on Windows.
msg276429 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016-09-14 12:57
New changeset 0ca42273c714 by Victor Stinner in branch '3.6': Issue #28114: Add unit tests on os.spawn*() https://hg.python.org/cpython/rev/0ca42273c714
msg276430 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-09-14 12:59
Berker and/or Eryksun: Please write unit tests for your patch. I just added a new SpawnTests to test_os which tests all os.spawn*() functions. Please add at least one unit test with env contains a bytes key/value entry. Maybe add a use_bytes=False parameter to create_args() to use bytes for the variable added by the test?
msg276457 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2016-09-14 14:19
Thanks for the spawn test framework, Victor. I've added a use_bytes argument to encode the args and env using os.fsencode. It's encoding args as well because parse_arglist calls fsconvert_strdup, which was assuming Unicode strings on Windows instead of first calling PyUnicode_FSDecoder. test_spawnve_bytes passes for me on Linux and Windows.
msg276465 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2016-09-14 15:23
Eryk's patch looks good to me, thanks! I will wait for others to review the patch.
msg276578 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016-09-15 17:19
New changeset 8d80a4291ddc by Berker Peksag in branch '3.6': Issue #28114: Fix a crash in parse_envlist() when env contains byte strings https://hg.python.org/cpython/rev/8d80a4291ddc New changeset 8ad99fdc84a3 by Berker Peksag in branch 'default': Issue #28114: Merge from 3.6 https://hg.python.org/cpython/rev/8ad99fdc84a3
msg276581 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2016-09-15 17:48
Thanks for the patch, Eryk!
msg276593 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-09-15 19:06
> #if defined(HAVE_WEXECV) | defined(HAVE_WSPAWNV) Hum, can't we use MS_WINDOWS here? Or maybe pass a parameter in parse_envlist() for Windows?
msg276599 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2016-09-15 19:17
We already use HAVE_WEXECV and HAVE_WSPAWNV elsewhere to choose the wide-char versions of these functions. You've got a few changes to make if you want to maintain consistency. Personally I prefer feature flags to platform flags, then let pyconfig.h control them, especially when we're not directly using platform-specific APIs.
msg276600 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-09-15 19:17
Ok fine ;-) Let's keep #if defined(HAVE_WEXECV) | defined(HAVE_WSPAWNV).
History
Date User Action Args
2022-04-11 14:58:36 admin set github: 72301
2017-03-31 16:36:10 dstufft set pull_requests: + <pull%5Frequest861>
2016-09-15 19:17:47 vstinner set messages: +
2016-09-15 19:17:06 steve.dower set messages: +
2016-09-15 19:06:12 vstinner set messages: +
2016-09-15 17:48:30 berker.peksag set status: open -> closedresolution: fixedmessages: + stage: patch review -> resolved
2016-09-15 17:19:44 python-dev set messages: +
2016-09-14 15:23:02 berker.peksag set messages: +
2016-09-14 15:12:45 eryksun set files: + issue_28114_03.patch
2016-09-14 14:19:27 eryksun set files: + issue_28114_02.patchmessages: +
2016-09-14 12:59:57 vstinner set messages: +
2016-09-14 12:57:56 python-dev set nosy: + python-devmessages: +
2016-09-14 08:48:10 eryksun set files: + issue_28114_01.patchmessages: +
2016-09-14 07:55:03 vstinner set messages: +
2016-09-14 07:53:14 vstinner set title: Crash in unicodeobject.c find_maxchar_surrogates on python-3.6.0b1 for Windows -> parse_envlist(): os.execve(), os.spawnve(), etc. crash in Python 3.6.0 when env contains byte strings
2016-09-14 07:28:11 berker.peksag set files: + issue28114_unix.diffnosy: + berker.peksagmessages: + keywords: + patchstage: patch review
2016-09-13 17:06:04 steve.dower set messages: +
2016-09-13 10:43:16 eryksun set messages: +
2016-09-13 10:04:15 vstinner set messages: +
2016-09-13 09:57:58 eryksun set nosy: + eryksunmessages: +
2016-09-13 08:57:24 vstinner set messages: +
2016-09-13 07🔞07 serhiy.storchaka set nosy: + serhiy.storchakamessages: + versions: + Python 3.7
2016-09-13 06:55:35 cgohlke create