Issue 28858: Fastcall uses more C stack (original) (raw)

Created on 2016-12-02 07:54 by vstinner, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
stack_overflow.py vstinner,2016-12-02 07:54
stack_overflow.py serhiy.storchaka,2016-12-04 22:49
Messages (10)
msg282228 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-12-02 07:54
Serhiy Storchaka reported that Python 3.6 crashs earlier than Python 3.5 on calling json.dumps() when sys.setrecursionlimit() is increased. I tested the script he wrote. Results on Python built in release mode: Python 3.7: ... 58100 116204 Segmentation fault (core dumped) Python 3.6: ... 74800 149604 Segmentation fault (core dumped) Python 3.5: ... 74700 149404 Segmentation fault (core dumped) Oh, it seems like Python 3.7 does crash earlier. But to be clear, it's hard to control the usage of the C stack.
msg282247 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-12-02 16:10
Oh, I didn't understand that the regression was introduced by the revision b9c9691c72c5. The purpose of this revision was to *reduce* the memory usage of the C stack!? It seems like _PyObject_CallArg1() uses more stack memory than PyObject_CallFunctionObjArgs(). PyObject_CallFunctionObjArgs() allocates 4O bytes (5*sizeof(PyObject*)) on the stack. At least, I can say that when the crash occurs, _PyObject_FastCallDict() is not the gdb backtrace.
msg282252 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-12-02 19:53
Yes, that is why I asked you to revert your changes. In additional, they introduced compiler warnings.
msg282370 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016-12-04 22:00
New changeset d35fc6e58a70 by Victor Stinner in branch 'default': Backed out changeset b9c9691c72c5 https://hg.python.org/cpython/rev/d35fc6e58a70
msg282371 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-12-04 22:13
> Serhiy Storchaka reported that Python 3.6 crashs earlier than Python 3.5 on calling json.dumps() when sys.setrecursionlimit() is increased. Reference: http://bugs.python.org/issue23507#msg282190 (issue #23507). Serhiy Storchaka: "Yes, that is why I asked you to revert your changes." Sorry, I misunderstood your comments. So yes, my change b9c9691c72c5 introduced a regression. Sorry, I didn't have time before now to revert my change. I just pushed the change d35fc6e58a70 which reverts b9c9691c72c5. The question is how replacing PyObject_CallFunctionObjArgs() with _PyObject_CallArg1() increases the usage of the C stack. I wrote my change to reduce the usage of the C stack. PyObject_CallFunctionObjArgs() allocates 5 "PyObject *", so 40 bytes, on the C stack. Maybe using _PyObject_CallArg1() increases the usage of C stack in the *caller*. > In additional, they introduced compiler warnings. This one was fixed by Benjamin Peterson in the issue #28855 (change 96245d4af0ca).
msg282373 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-12-04 22:49
Thanks Victor. Following script includes several examples of achieving a stack overflow (most are real world examples or can be used in real world examples). It measures maximal deep for every type of recursion.
msg282377 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-12-04 23:11
When I wrote the _PyObject_CallArg1(), it looks as a cool hack: #define _PyObject_CallArg1(func, arg) \ _PyObject_FastCall((func), (PyObject **)&(arg), 1) It hacks the declaration of an explicit "stack" like: PyObject *stack[1]; stack[0] = arg; res = _PyObject_FastCall(func, stack, 1); And I expected that the C compiler magically computes the memory address of the argument. But it seems like requesting the memory address of an argument allocates something on the C stack. On x86_64, first function arguments are passed with CPU registers. Maybe requesting the memory address of an argument requires to allocate a local variable, copy the register into the variable, to get the address of the local variable? So, I suggest to *remove* the _PyObject_CallArg1() macro, and use existing functions like PyObject_CallFunctionObjArgs(). What do you think Serhiy?
msg282385 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-12-05 06:06
That was my initial preference. Mainly because this doesn't add code churn. But I don't understand how PyObject_CallFunctionObjArgs() that uses _PyObject_CallArg1() and has many local variables can consume less stack than _PyObject_CallArg1() itself.
msg282426 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016-12-05 16:13
New changeset 4171cc26272c by Victor Stinner in branch 'default': Issue #28858: Remove _PyObject_CallArg1() macro https://hg.python.org/cpython/rev/4171cc26272c
msg282447 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-12-05 17:45
The changeset 4171cc26272c "Remove _PyObject_CallArg1() macro" fixed the initial bug report, so I now close the issue. Serhiy: If you see further enhancements, please open a new issue. Your second stack_overflow.py script is interesting, but I don't see any obvious possible changes to enhance results. Thanks Serhiy for digging into this issue ;-)
History
Date User Action Args
2022-04-11 14:58:40 admin set github: 73044
2016-12-05 17:45:34 vstinner set status: open -> closedresolution: fixedmessages: +
2016-12-05 16:13:21 python-dev set messages: +
2016-12-05 06:06:17 serhiy.storchaka set messages: +
2016-12-04 23:11:19 vstinner set messages: +
2016-12-04 22:49:30 serhiy.storchaka set files: + stack_overflow.pymessages: +
2016-12-04 22:13:53 vstinner set messages: +
2016-12-04 22:00:17 python-dev set nosy: + python-devmessages: +
2016-12-02 19:53:47 serhiy.storchaka set messages: +
2016-12-02 16:10:15 vstinner set nosy: + serhiy.storchakamessages: +
2016-12-02 07:54:27 vstinner create