bpo-36974: separate vectorcall functions for each calling convention by jdemeyer · Pull Request #13781 · python/cpython (original) (raw)
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Conversation17 Commits4 Checks0 Files changed
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters
[ Show hidden characters]({{ revealButtonHref }})
Implement an optimization which is mentioned in PEP 590: use a specific vectorcall function for each calling convention (METH_O
, METH_FASTCALL
and so on) to replace the generic _PyCFunction_Vectorcall
.
This allows getting rid of _PyMethodDef_RawFastCallKeywords
in Python 3.9 (we should keep it in 3.8 since Cython uses it).
https://bugs.python.org/issue36974
Thank you!
This will not go into beta1 (we're asked to not push last-minute changes).
Not sure if it'll qualify for beta2, or if it'll have to wait until 3.9. I'll ask the RM once beta1 is out.
Either way it'll need a news entry now.
Either way it'll need a news entry now.
Why? It's an implementation detail that doesn't affect any public API or behavior.
Apart from that, do you accept this PR for merging (at least for 3.9)?
Either way it'll need a news entry now.
Why? It's an implementation detail that doesn't affect any public API or behavior.
It does change functionality. See the GDB plugin – there are other tools like it that aren't tested in CPython's test suite.
Apart from that, do you accept this PR for merging (at least for 3.9)?
I haven't reviewed it yet, but yes, it looks good at first glance.
I haven't reviewed it yet, but yes, it looks good at first glance.
If so, I'll do the analogous change for PyMethodDescr
.
Have you benchmarked this change?
Have you benchmarked this change?
Using https://github.com/jdemeyer/callbench:
Before:
VARARGS function(): Mean +- std dev: 24.7 ns +- 0.3 ns
FASTCALL function(): Mean +- std dev: 19.5 ns +- 0.4 ns
VARARGS obj.method(): Mean +- std dev: 63.0 ns +- 2.3 ns
FASTCALL obj.method(): Mean +- std dev: 56.3 ns +- 1.5 ns
VARARGS bound method(): Mean +- std dev: 25.6 ns +- 2.7 ns
FASTCALL bound method(): Mean +- std dev: 19.4 ns +- 0.4 ns
VARARGS unbound method(obj, ): Mean +- std dev: 34.0 ns +- 0.3 ns
FASTCALL unbound method(obj, ): Mean +- std dev: 27.4 ns +- 0.4 ns
tp_call(): Mean +- std dev: 19.9 ns +- 0.3 ns
After:
VARARGS function(): Mean +- std dev: 25.1 ns +- 1.1 ns
FASTCALL function(): Mean +- std dev: 17.2 ns +- 0.8 ns
VARARGS obj.method(): Mean +- std dev: 58.9 ns +- 0.7 ns
FASTCALL obj.method(): Mean +- std dev: 54.0 ns +- 2.2 ns
VARARGS bound method(): Mean +- std dev: 25.1 ns +- 1.3 ns
FASTCALL bound method(): Mean +- std dev: 16.9 ns +- 0.1 ns
VARARGS unbound method(obj, ): Mean +- std dev: 29.9 ns +- 0.3 ns
FASTCALL unbound method(obj, ): Mean +- std dev: 25.0 ns +- 0.4 ns
tp_call(): Mean +- std dev: 19.9 ns +- 0.3 ns
So this is clearly an improvement.
Rebased to latest master.
Rebased to latest master.
Two issues with the code:
- PyCFunction's vectorcall implementations are only used in methodobject.c,
so IMO they should be moved there and made static. - The big macros make debugging harder. Can we do without magically defined
variables and return from macros?
I'd prefer somewhat longer, but more straightforward code.
I made these changes, so I sent a PR to your branch. I didn't yet have time
to polish them; feel free to work change them if you agree.
Petr: what's your opinion in general about this PR? Suppose that I merge your changes and polish them, would you then be willing to merge this? I'm asking because other potential reviewers may disagree with your changes.
Yes, I'm willing to merge this (though I might take longer than some other reviewers).
If other reviewers disagree, I'd want to hear their thoughts.
I made some further fixes and cleanups regarding to Petr's version:
- Don't declare
vectorcall_FOO
asinline
. They are called through a function pointer, inlining is meaningless. - Rename
vectorcall_FOO
->cfunction_vectorcall_FOO
and_PyMethodDescr_Vectorcall_FOO
->method_vectorcall_FOO
. - For
PyCFunctionObject
, use existing macrosPyCFunction_GET_FUNCTION
andPyCFunction_GET_SELF
. return -1
in case of an exception, to follow the more common CPython convention.- Put
_Py_CheckFunctionResult
in_PyObject_Vectorcall
andPyVectorcall_Call
instead of in the callables. Get rid ofvectorcall_end
function. - Merge
vectorcall_begin
andget_meth
as one functioncfunction_enter_call
/method_enter_call
. - For method descriptors, merge
vectorcall_check
andcheck_no_kwargs
as one functionmethod_check_args
. - Fix reference leaks in
METH_VARARGS
cases. - Added a testcase for
dict.update()
to testmethod_vectorcall_VARARGS_KEYWORDS
with keyword arguments. This was not tested before.
Rebased to latest master, fixing trivial merge conflict.
jdemeyer deleted the pycfunction_vectorcall branch
jdemeyer added a commit to jdemeyer/cpython that referenced this pull request
See #14782 for a backport to 3.8
ambv pushed a commit that referenced this pull request
lisroach pushed a commit to lisroach/cpython that referenced this pull request
DinoV pushed a commit to DinoV/cpython that referenced this pull request