bpo-36974: separate vectorcall functions for each calling convention by jdemeyer · Pull Request #13781 · python/cpython (original) (raw)

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Conversation17 Commits4 Checks0 Files changed

Conversation

This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters

[ Show hidden characters]({{ revealButtonHref }})

jdemeyer

Implement an optimization which is mentioned in PEP 590: use a specific vectorcall function for each calling convention (METH_O, METH_FASTCALL and so on) to replace the generic _PyCFunction_Vectorcall.

This allows getting rid of _PyMethodDef_RawFastCallKeywords in Python 3.9 (we should keep it in 3.8 since Cython uses it).

CC @encukou @markshannon

https://bugs.python.org/issue36974

@encukou

Thank you!

This will not go into beta1 (we're asked to not push last-minute changes).
Not sure if it'll qualify for beta2, or if it'll have to wait until 3.9. I'll ask the RM once beta1 is out.
Either way it'll need a news entry now.

@jdemeyer

Either way it'll need a news entry now.

Why? It's an implementation detail that doesn't affect any public API or behavior.

Apart from that, do you accept this PR for merging (at least for 3.9)?

@encukou

Either way it'll need a news entry now.

Why? It's an implementation detail that doesn't affect any public API or behavior.

It does change functionality. See the GDB plugin – there are other tools like it that aren't tested in CPython's test suite.

Apart from that, do you accept this PR for merging (at least for 3.9)?

I haven't reviewed it yet, but yes, it looks good at first glance.

@jdemeyer

I haven't reviewed it yet, but yes, it looks good at first glance.

If so, I'll do the analogous change for PyMethodDescr.

@markshannon

Have you benchmarked this change?

@jdemeyer

Have you benchmarked this change?

Using https://github.com/jdemeyer/callbench:

Before:

VARARGS function():             Mean +- std dev: 24.7 ns +- 0.3 ns
FASTCALL function():            Mean +- std dev: 19.5 ns +- 0.4 ns
VARARGS obj.method():           Mean +- std dev: 63.0 ns +- 2.3 ns
FASTCALL obj.method():          Mean +- std dev: 56.3 ns +- 1.5 ns
VARARGS bound method():         Mean +- std dev: 25.6 ns +- 2.7 ns
FASTCALL bound method():        Mean +- std dev: 19.4 ns +- 0.4 ns
VARARGS unbound method(obj, ):  Mean +- std dev: 34.0 ns +- 0.3 ns
FASTCALL unbound method(obj, ): Mean +- std dev: 27.4 ns +- 0.4 ns
tp_call():                      Mean +- std dev: 19.9 ns +- 0.3 ns

After:

VARARGS function():             Mean +- std dev: 25.1 ns +- 1.1 ns
FASTCALL function():            Mean +- std dev: 17.2 ns +- 0.8 ns
VARARGS obj.method():           Mean +- std dev: 58.9 ns +- 0.7 ns
FASTCALL obj.method():          Mean +- std dev: 54.0 ns +- 2.2 ns
VARARGS bound method():         Mean +- std dev: 25.1 ns +- 1.3 ns
FASTCALL bound method():        Mean +- std dev: 16.9 ns +- 0.1 ns
VARARGS unbound method(obj, ):  Mean +- std dev: 29.9 ns +- 0.3 ns
FASTCALL unbound method(obj, ): Mean +- std dev: 25.0 ns +- 0.4 ns
tp_call():                      Mean +- std dev: 19.9 ns +- 0.3 ns

So this is clearly an improvement.

@jdemeyer

Rebased to latest master.

@jdemeyer

@jdemeyer

Rebased to latest master.

@encukou

Two issues with the code:

I made these changes, so I sent a PR to your branch. I didn't yet have time
to polish them; feel free to work change them if you agree.

@jdemeyer

Petr: what's your opinion in general about this PR? Suppose that I merge your changes and polish them, would you then be willing to merge this? I'm asking because other potential reviewers may disagree with your changes.

@encukou

Yes, I'm willing to merge this (though I might take longer than some other reviewers).
If other reviewers disagree, I'd want to hear their thoughts.

@jdemeyer

I made some further fixes and cleanups regarding to Petr's version:

  1. Don't declare vectorcall_FOO as inline. They are called through a function pointer, inlining is meaningless.
  2. Rename vectorcall_FOO -> cfunction_vectorcall_FOO and _PyMethodDescr_Vectorcall_FOO -> method_vectorcall_FOO.
  3. For PyCFunctionObject, use existing macros PyCFunction_GET_FUNCTION and PyCFunction_GET_SELF.
  4. return -1 in case of an exception, to follow the more common CPython convention.
  5. Put _Py_CheckFunctionResult in _PyObject_Vectorcall and PyVectorcall_Call instead of in the callables. Get rid of vectorcall_end function.
  6. Merge vectorcall_begin and get_meth as one function cfunction_enter_call/method_enter_call.
  7. For method descriptors, merge vectorcall_check and check_no_kwargs as one function method_check_args.
  8. Fix reference leaks in METH_VARARGS cases.
  9. Added a testcase for dict.update() to test method_vectorcall_VARARGS_KEYWORDS with keyword arguments. This was not tested before.

@jdemeyer

Rebased to latest master, fixing trivial merge conflict.

@jdemeyer

@jdemeyer jdemeyer deleted the pycfunction_vectorcall branch

July 5, 2019 12:51

@encukou

jdemeyer added a commit to jdemeyer/cpython that referenced this pull request

Jul 15, 2019

@jdemeyer

@jdemeyer

See #14782 for a backport to 3.8

ambv pushed a commit that referenced this pull request

Jul 23, 2019

@jdemeyer @ambv

lisroach pushed a commit to lisroach/cpython that referenced this pull request

Sep 10, 2019

@jdemeyer @lisroach

DinoV pushed a commit to DinoV/cpython that referenced this pull request

Jan 14, 2020

@jdemeyer @DinoV