[Numpy-discussion] NEP: Dispatch Mechanism for NumPy’s high level API
Marten van Kerkwijk m.h.vankerkwijk at gmail.com
Sun Jun 3 19:23:58 EDT 2018
In most cases, I suspect that the overhead of a function call and checking several arguments for "__array_function__" will be negligible, like the situation for __array_ufunc__. I'm not strongly opposed to either of your proposed solutions, but I do think it would be a little strange to insist that we need a solution for __array_function__ when __array_ufunc__ was fine.
Ufuncs actually do try to speed up array checks - but indeed the same can (and should) be done for __array_ufunc__. They also do have subok. This is currently ignored, but that is mostly because looking for it in kwargs is so damn slow!
Anyway, my main point was that it should be explicitly mentioned as a constraint that for pure ndarray input, things should be really fast.
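For concreteness, a minimal sketch of the kind of per-call work that constraint is about; the helper name and structure here are illustrative, not the NEP's actual implementation:

    import numpy as np

    def collect_overloads(relevant_args):
        # Gather __array_function__ implementations from the arguments a
        # wrapped function would inspect.  For pure-ndarray input this loop
        # does nothing beyond one type check per argument, which is the
        # overhead that needs to stay negligible.
        overloads = []
        for arg in relevant_args:
            if type(arg) is np.ndarray:
                continue  # exact ndarray: no overload possible
            method = getattr(type(arg), '__array_function__', None)
            if method is not None:
                overloads.append((type(arg), method))
        return overloads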
A. Two "namespaces", one for the undecorated base functions, and one completely trivial one for the decorated ones. The idea would be that if one knows one is dealing with arrays only, one would do
import numpy.arrayonly as np
(i.e., the reverse of the suggestion currently in the NEP, where the decorated ones are in their own namespace - I agree with the reasons for discounting that one).

I will mention this as a possibility. I do think there is something to be said for clear separation of overloaded and non-overloaded APIs. But if I were to choose between adding numpy.api and numpy.arrayonly, I would pick numpy.api, because of the virtue of preserving the existing numpy namespace as it is.
Good point. Overall, the separate namespaces probably are not the way to go.
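Purely for illustration, option A would have amounted to something like the layout below; the numpy.arrayonly module is hypothetical and, per the above, probably not worth pursuing:

    # Hypothetical layout for option A; numpy.arrayonly does not exist.
    #
    #   numpy/__init__.py   -> functions wrapped for __array_function__ dispatch
    #   numpy/arrayonly.py  -> the same functions, undecorated (plain coercion)
    #
    # A caller that knows it only has plain ndarrays could then opt out of
    # dispatch entirely:
    #
    #   import numpy.arrayonly as np
    #   np.concatenate([a, b])   # no __array_function__ checks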
B. Automatic insertion by the decorator of an arrayonly=np.NoValue (or coerce and perhaps subok=... if not present) in the function signature, so that users who know that they have arrays only could pass arrayonly=True (name to be decided).

Rather than adding another argument to every NumPy function, I would rather encourage writing np.asarray() explicitly.
Good point - just as good as long as the check for all-array is very fast (which it should be - arg.__class__ is np.ndarray is fast!).
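A minimal illustration of that check (just a sketch; the helper name is made up):

    import numpy as np

    def all_plain_ndarray(args):
        # Identity check on the class: this deliberately excludes ndarray
        # subclasses, which is exactly what a "no overload possible" fast
        # path wants.
        return all(arg.__class__ is np.ndarray for arg in args)

    a, b = np.arange(3), np.ones(3)
    assert all_plain_ndarray((a, b))
    assert not all_plain_ndarray((a, [1, 2, 3]))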
Note that both A and B could also address, at least partially, the problem of sometimes wanting to just use the old coercion methods, i.e., not having to implement every possible numpy function in one go in a new __array_function__ on one's class.

Yes, agreed.

1. I'm rather unclear about the use of types. It can help me decide what to do, but I would still have to find the argument in question (e.g., for Quantity, the unit of the relevant argument). I'd recommend passing instead a tuple of all arguments that were inspected, in the inspection order; after all, it is just an arg.__class__ away from the type, and in your example you'd only have to replace issubclass by isinstance.

The virtue of a types argument is that we can deduplicate arguments once, rather than in each __array_function__ check. This could result in significantly more efficient code, e.g., when np.concatenate() is called on 10,000 arrays with only two unique types, we don't need to loop through all 10,000 objects again to check that overloading is valid.
I think one might still want to know where the type occurs (e.g., as an output or index would have different implications). Possibly, a solution would rely on the same structure as used for the "dance". But as a general point, I don't see the advantage of passing types rather than arguments - less information for no benefit.
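To make the comparison concrete, here is a sketch of the first check inside a hypothetical Quantity.__array_function__, written against the draft signature (func, types, args, kwargs); the class and the alternative shown in the comments are illustrative, not astropy's actual code:

    import numpy as np

    class Quantity:
        # Toy stand-in: wraps an array and a unit string.
        def __init__(self, value, unit):
            self.value = np.asarray(value)
            self.unit = unit

        def __array_function__(self, func, types, args, kwargs):
            # Check via `types` (the draft NEP): one pass over the
            # deduplicated classes that the dispatcher inspected.
            if not all(issubclass(t, (np.ndarray, Quantity)) for t in types):
                return NotImplemented
            # With the alternative - a tuple of the inspected arguments -
            # the same check would read
            #     all(isinstance(a, (np.ndarray, Quantity)) for a in inspected)
            # and the unit of the relevant argument would be directly at hand.
            raise NotImplementedError("per-function dispatch not shown here")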
Even for Quantity, I suspect you will want two layers of checks:
1. A check to verify that every argument is a Quantity (or something coercible to a Quantity). This could use types and return NotImplemented when it fails.
2. A check to verify that units match. This will have custom logic for different operations and will require checking all arguments -- not just their unique types.
Not sure. With Quantity, I generally do not worry about other types, but rather look at unit attributes, assume anything without one is dimensionless, cast Quantity to array with the right unit, and then defer to ndarray.
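Schematically, that pattern looks roughly like the helper below; the unit handling and the .to(...).value conversion API are toy versions, not astropy's:

    import numpy as np

    def value_in(arg, unit):
        # Anything without a `unit` attribute is assumed dimensionless and
        # simply coerced; anything with one is converted to `unit` first.
        if getattr(arg, 'unit', None) is None:
            return np.asarray(arg)
        return np.asarray(arg.to(unit).value)

A concatenate-like overload would then convert every input with value_in(arg, self.unit), call the plain ndarray implementation on the results, and re-attach self.unit to the output.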
For many Quantity functions, the second check will indeed probably be super simple (i.e., verifying that all units match). But the first check (with types) really is something that basically every overload should do.

2. For subclasses, it would be very handy to have ndarray.__array_function__, so one can call super after changing arguments. (For __array_ufunc__, there was lots of question about whether this was useful, but it really is!!) [I think you already agreed with this, but want to have it in place, as for subclasses of ndarray this is just as useful as it would be for subclasses of dask arrays.]

Yes, indeed.
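As a sketch of why this is handy (assuming ndarray does grow a default __array_function__, mirroring what happened with __array_ufunc__; the subclass here is a toy, not a real use case):

    import numpy as np

    class WrappedArray(np.ndarray):
        # Toy ndarray subclass that adjusts arguments and then defers up.
        def __array_function__(self, func, types, args, kwargs):
            # Change arguments as needed (here: just drop the subclass view),
            # then let ndarray's default implementation do the real work.
            args = tuple(a.view(np.ndarray) if isinstance(a, WrappedArray)
                         else a for a in args)
            result = super().__array_function__(func, types, args, kwargs)
            if isinstance(result, np.ndarray):
                result = result.view(type(self))
            return result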
NumPy-Discussion mailing list NumPy-Discussion at python.org https://mail.python.org/mailman/listinfo/numpy-discussion