msg276189 - (view) |
Author: Christoph Gohlke (cgohlke) |
Date: 2016-09-13 06:55 |
Trying to build numpy-1.11.2rc1 wheels for Python 3.6.0b1 on Windows 10 with `python.exe setup.py bdist_wheel`, python36.dll (32 and 64 bit) segfaults in the `find_maxchar_surrogates` function. Some other packages built OK, e.g. Cython, pyzmq and Pillow. The crash is at <https://hg.python.org/cpython/file/v3.6.0b1/Objects/unicodeobject.c#l1607>. The call stack is attached. Locals: + begin 0x00454c49464f5250 const wchar_t * ch 1509860640 unsigned int + end 0x004550fc90512200 const wchar_t * + iter 0x00000259a500e7d8 L"\x2" const wchar_t * + maxchar 0x0000005a59fea520 {0} unsigned int * + num_surrogates 0x0000005a59fea528 {0} __int64 * |
|
|
msg276194 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2016-09-13 07:18 |
end-begin = 0x004550fc90512200-0x00454c49464f5250 = 5168087289776 Seems find_maxchar_surrogates() is called with wrong arguments. |
|
|
msg276215 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2016-09-13 08:57 |
> Trying to build numpy-1.11.2rc1 wheels for Python 3.6.0b1 on Windows 10 with `python.exe setup.py bdist_wheel`, python36.dll (32 and 64 bit) segfaults in the `find_maxchar_surrogates` function. I ran "python setup.py bdist_wheel" on numpy-1.11.2rc1 on Linux: no issue. I compiled Python 3.6 (the default branch in fact, so the future 3.7) in debug mode/64 bit with SSL, but the setup.py command failed with: Don't know how to compile Fortran code on platform 'nt' I don't want to spend too much time to try to reproduce the issue. Can you please try to get the Python traceback? Try: python -X faulthandler setup.py bdist_wheel It seems like the crash occurs in parse_envlist(), function to parse environment variables in os.spawnve(). Can you try to dump the env parameter of os.spawnve()? For example, you can put the following lines at the beginning of setup.py: --- import os orig_spawnve = os.spawnve def log_spawnve(*args): print("spawnve: %r" % (args,)) return orig_spawnve(*args) os.spawnve = log_spawnve --- My test: --- >>> import os, sys; args=[sys.executable, "-c", "pass"]; os.spawnve(os.P_WAIT, args[0], args, None) spawnve: (0, '/home/haypo/prog/python/default/python', ['/home/haypo/prog/python/default/python', '-c', 'pass'], None) 127 --- |
|
|
msg276234 - (view) |
Author: Eryk Sun (eryksun) *  |
Date: 2016-09-13 09:57 |
parse_envlist is calling PyUnicode_FromFormat with the format "%U=%U", but passing it bytes: |Debug Error! Program: C:\Program Files\Python36\python_d.exe abort() has been called (Press Retry to debug the application) (1e8.374): Break instruction exception - code 80000003 (first chance) ucrtbased!issue_debug_notification+0x45: 00007ffc`aa1f1985 cc int 3 1:002> k 0n12 Child-SP RetAddr Call Site 00000030`a5fe9bb0 00007ffc`aa1f1b23 ucrtbased!issue_debug_notification+0x45 00000030`a5fe9c00 00007ffc`aa212d7d ucrtbased!__acrt_report_runtime_error+0x13 00000030`a5fe9c60 00007ffc`aa2184cf ucrtbased!abort+0x1d 00000030`a5fe9ca0 00007ffc`aa2164c8 ucrtbased!common_assert_to_stderr<wchar_t>+0xbf 00000030`a5fe9d00 00007ffc`aa218e7f ucrtbased!common_assert<wchar_t>+0x68 00000030`a5fe9d40 00000000`60e59045 ucrtbased!_wassert+0x2f 00000030`a5fe9d70 00000000`60eb9f71 python36_d!_PyUnicode_CheckConsistency+0x45 00000030`a5fe9e00 00000000`60e4cd07 python36_d!unicode_fromformat_arg+0xc81 00000030`a5fe9fa0 00000000`60e4cc31 python36_d!PyUnicode_FromFormatV+0xa7 00000030`a5fea040 00000000`60c69853 python36_d!PyUnicode_FromFormat+0x31 00000030`a5fea080 00000000`60c68286 python36_d!parse_envlist+0x1c3 00000030`a5fea180 00000000`60c5d4ee python36_d!os_spawnve_impl+0x1d6 1:002> .frame 6 06 00000030`a5fe9d70 00000000`60eb9f71 python36_d!_PyUnicode_CheckConsistency+0x45 1:002> ?? op->ob_type->tp_name char * 0x00000000`60fe1248 "bytes" parse_envlist used to call PyUnicode_FSConverter on key and val; get the underlying buffers via PyBytes_AsString; allocate a new buffer and copy key and val into "%s=%s". Now it instead calls PyUnicode_FromFormat for "%U=%U" and then fsconvert_strdup, which has been modified to call PyUnicode_AsWideCharString on Windows and otherwise PyUnicode_FSConverter. For Windows it could use PyUnicode_FSDecoder on key and val in combination with PyUnicode_FromFormat and the format string "%U=%U". For Unix it could use PyUnicode_FSConverter for key and val in combination with PyBytes_FromFormat and the format string "%s=%s". |
|
|
msg276235 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2016-09-13 10:04 |
> parse_envlist is calling PyUnicode_FromFormat with the format "%U=%U", but passing it bytes: Ah, it's a regression introduced by the issue #27781 with the change e20c7d8a8187: parse_envlist() was modified to first call PyUnicode_FromFormat("%U=%U") and then convert. |
|
|
msg276241 - (view) |
Author: Eryk Sun (eryksun) *  |
Date: 2016-09-13 10:43 |
Here's a snippet to reproduce this bug: import os, sys environb = {os.fsencode(k):os.fsencode(v) for k,v in os.environ.items()} os.spawnve(os.P_WAIT, sys.executable, ('python', '--version'), environb) (Now that Windows Python provisionally supports bytes via utf-8:surrogatepass, maybe it should also have an os.environb mapping.) |
|
|
msg276304 - (view) |
Author: Steve Dower (steve.dower) *  |
Date: 2016-09-13 17:06 |
Whoops, yeah that's my fault. I'll get to it before b2, but I need to catch up on my day job the next week or two, so if someone else wants to apply a fix (and add a test - test_crashers, presumably?) then go for it! |
|
|
msg276388 - (view) |
Author: Berker Peksag (berker.peksag) *  |
Date: 2016-09-14 07:28 |
Here's a patch for Unix. I will add Windows support when I get my Windows VM. > [...] (and add a test - test_crashers, presumably?) [...] Unfortunately, test_crashers doesn't run since 2011 (skipped in 481ad9129a0f.) parse_envlist() is only used by os.execve() and os.spawnve() so I'm not sure what's the best way to test the patch (other than adapting Eryk's snippet in .) |
|
|
msg276391 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2016-09-14 07:55 |
The issue is not specific to Windows, the following example also crash on Linux: import os, sys args = [sys.executable, "-c", "pass"] os.execve(args[0], args, os.environb) |
|
|
msg276400 - (view) |
Author: Eryk Sun (eryksun) *  |
Date: 2016-09-14 08:48 |
Berker, this is basically what I had in my initial patch on the Unix side. I also addressed the Windows issues in parse_envlist and fsconvert_strdup. I'm uploading that patch for reference. It needs a test. I also need to verify that there are no additional problems with the high-level os._execvpe function regarding bytes support on Windows. |
|
|
msg276429 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2016-09-14 12:57 |
New changeset 0ca42273c714 by Victor Stinner in branch '3.6': Issue #28114: Add unit tests on os.spawn*() https://hg.python.org/cpython/rev/0ca42273c714 |
|
|
msg276430 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2016-09-14 12:59 |
Berker and/or Eryksun: Please write unit tests for your patch. I just added a new SpawnTests to test_os which tests all os.spawn*() functions. Please add at least one unit test with env contains a bytes key/value entry. Maybe add a use_bytes=False parameter to create_args() to use bytes for the variable added by the test? |
|
|
msg276457 - (view) |
Author: Eryk Sun (eryksun) *  |
Date: 2016-09-14 14:19 |
Thanks for the spawn test framework, Victor. I've added a use_bytes argument to encode the args and env using os.fsencode. It's encoding args as well because parse_arglist calls fsconvert_strdup, which was assuming Unicode strings on Windows instead of first calling PyUnicode_FSDecoder. test_spawnve_bytes passes for me on Linux and Windows. |
|
|
msg276465 - (view) |
Author: Berker Peksag (berker.peksag) *  |
Date: 2016-09-14 15:23 |
Eryk's patch looks good to me, thanks! I will wait for others to review the patch. |
|
|
msg276578 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2016-09-15 17:19 |
New changeset 8d80a4291ddc by Berker Peksag in branch '3.6': Issue #28114: Fix a crash in parse_envlist() when env contains byte strings https://hg.python.org/cpython/rev/8d80a4291ddc New changeset 8ad99fdc84a3 by Berker Peksag in branch 'default': Issue #28114: Merge from 3.6 https://hg.python.org/cpython/rev/8ad99fdc84a3 |
|
|
msg276581 - (view) |
Author: Berker Peksag (berker.peksag) *  |
Date: 2016-09-15 17:48 |
Thanks for the patch, Eryk! |
|
|
msg276593 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2016-09-15 19:06 |
> #if defined(HAVE_WEXECV) | |
defined(HAVE_WSPAWNV) Hum, can't we use MS_WINDOWS here? Or maybe pass a parameter in parse_envlist() for Windows? |
|
msg276599 - (view) |
Author: Steve Dower (steve.dower) *  |
Date: 2016-09-15 19:17 |
We already use HAVE_WEXECV and HAVE_WSPAWNV elsewhere to choose the wide-char versions of these functions. You've got a few changes to make if you want to maintain consistency. Personally I prefer feature flags to platform flags, then let pyconfig.h control them, especially when we're not directly using platform-specific APIs. |
|
|
msg276600 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2016-09-15 19:17 |
Ok fine ;-) Let's keep #if defined(HAVE_WEXECV) | |
defined(HAVE_WSPAWNV). |
|