[Python-Dev] C-level duck typing

Stefan Behnel stefan_ml at behnel.de
Thu May 17 14:14:23 CEST 2012


Mark Shannon, 17.05.2012 12:38:

Dag Sverre Seljebotn wrote:

On 05/16/2012 10:24 PM, Robert Bradshaw wrote:

On Wed, May 16, 2012 at 11:33 AM, "Martin v. Löwis" <martin at v.loewis.de> wrote:

Does this use case make sense to everyone?

The reason why we are discussing this on python-dev is that we are looking for a general way to expose these C-level signatures within the Python ecosystem. And Dag's idea was to expose them as part of the type object, basically as an addition to the current Python-level tp_call() slot.

The use case makes sense, yet there is also a long-standing solution for exposing APIs and function pointers: capsule objects. If you want to avoid dictionary lookups on the server side, implement tp_getattro, comparing addresses of interned strings.

Yes, that's an idea worth looking at. The point about implementing tp_getattro to avoid dictionary lookup overhead is a good one, worth trying at least. One drawback is that this approach requires the GIL (as does _PyType_Lookup).

Regarding the C function being faster than the dictionary lookup (or at least close enough that the lookup takes comparable time): yes, this happens all the time. For example, one might be solving differential equations where the "user input" is essentially a set of (usually simple) double f(double) functions and their derivatives.

To underline how performance-critical this is to us, perhaps a full Cython example is useful. The following Cython code is a real-world use case, not too contrived in the essentials, although simplified a little. For instance, undergrad engineering students could pick up Cython just to play with simple scalar functions like this:

    from numpy import sin
    # assume sin is a Python callable and that NumPy decides to support
    # our spec to also support getting a "double (*sinfuncptr)(double)".

    # Our mission: avoid having the user manually import "sin" from C,
    # but allow just using the NumPy object and still be fast.

    # define a function to integrate
    cpdef double f(double x):
        return sin(x * x)  # guess on signature and use "fastcall"!
    # the integrator
    def integrate(func, double a, double b, int n):
        cdef double s = 0
        cdef double dx = (b - a) / n
        for i in range(n):
            # This is also a fastcall, but can be cached so it doesn't
            # matter...
            s += func(a + i * dx)
        return s * dx

    integrate(f, 0, 1, 1000000)

There are two problems here:

- The "sin" global can be reassigned (monkey-patched) between each call to "f", with no way for "f" to know. Even "sin" itself could do the reassignment. So you'd need to check for reassignment to do caching...

Since Cython allows static typing, why not just declare that "func" can treat "sin" as if it can't be monkey-patched?
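For readers without a Cython toolchain, a plain-Python rendering of the example may help (a sketch only: the whole point of the proposed spec is that Cython could skip the generic call protocol that this version pays for on every iteration). Patching math.sin here is purely a demonstration of the hazard described above.

```python
import math

def f(x):
    # each call performs a lookup of "math.sin" plus a generic
    # Python-level call -- the overhead under discussion
    return math.sin(x * x)

def integrate(func, a, b, n):
    # left Riemann sum, mirroring the Cython integrator
    s = 0.0
    dx = (b - a) / n
    for i in range(n):
        s += func(a + i * dx)
    return s * dx

result = integrate(f, 0, 1, 100000)

# The monkey-patching hazard: rebinding the name changes f's behaviour
# between calls, so a naively cached "fast" pointer would go stale.
original_sin = math.sin
math.sin = lambda x: 0.0
patched = integrate(f, 0, 1, 1000)
math.sin = original_sin  # restore
```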

You'd simply say

cdef object sin    # declare it as a C variable of type 'object'
from numpy import sin

That's also the one obvious way to do it in Cython.

Moving the load of a global variable out of the loop would seem to be a rather obvious optimisation, if it were declared to be legal.
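What that hoisting buys can be seen in plain CPython today by binding the global to a local before the loop, a well-known micro-optimisation (shown here only to illustrate the effect; Cython would do the equivalent automatically if the semantics allowed it):

```python
import math

def integrate_slow(a, b, n):
    s = 0.0
    dx = (b - a) / n
    for i in range(n):
        s += math.sin(a + i * dx)  # module-attribute lookup per iteration
    return s * dx

def integrate_fast(a, b, n):
    sin = math.sin  # load hoisted out of the loop;
    s = 0.0         # a local lookup is a cheap array access
    dx = (b - a) / n
    for i in range(n):
        s += sin(a + i * dx)
    return s * dx

# both compute the same sum; only the lookup cost differs
r1 = integrate_slow(0.0, 1.0, 100000)
r2 = integrate_fast(0.0, 1.0, 100000)
```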

My proposal was to simply extract any C function pointers at assignment time, i.e. at import time in the example above. Signature matching can then be done at the first call and the result can be cached as long as the object variable isn't changed. All of that is local to the module and can thus easily be controlled at code generation time.
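A rough Python model of that scheme may make it concrete. Everything here is invented for illustration: the `__nativecall__` attribute name, the "d->d" signature string, and the use of a plain callable to stand in for a C function pointer; the real design would live at the C level in Cython-generated code.

```python
import math

# Hypothetical protocol: an object's type advertises native entry points
# as a mapping from a signature string to a callable standing in for a
# C function pointer.
class NativeSin:
    __nativecall__ = {"d->d": math.sin}  # the "extracted pointer"

    def __call__(self, x):               # slow generic path
        return math.sin(x)

# Module-level cache: valid as long as the variable isn't reassigned.
_cached_obj = None
_cached_ptr = None

def fast_call(obj, x):
    # Signature matching happens on the first call; the result is
    # cached until a different object is seen.
    global _cached_obj, _cached_ptr
    if obj is not _cached_obj:
        table = getattr(type(obj), "__nativecall__", None)
        _cached_ptr = table.get("d->d") if table is not None else None
        _cached_obj = obj
    if _cached_ptr is not None:
        return _cached_ptr(x)   # fast path, bypasses the generic call
    return obj(x)               # generic fallback

sin = NativeSin()
r = fast_call(sin, 0.5)         # first call: match signature, cache
r2 = fast_call(abs, -3.0)       # no table on this type: falls back
```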

Stefan


