[Numpy-discussion] Allowing broadcasting of code dimensions in generalized ufuncs

Marten van Kerkwijk m.h.vankerkwijk at gmail.com
Fri Jun 1 17:41:18 EDT 2018


Hi Nathaniel,

On Matt's prompting, I added release notes to the frozen/flexible PR [1]; see text attached below.

Having done that, I felt the examples actually justified the frozen dimensions quite well. Given that you're the one who expressed most doubts about them, could you have a look? Ideally, I'd avoid having to write a NEP for this, and the examples do seem to make it quite obvious that this change to the signature is the way to go, as its meaning is dead obvious. And the implementation is super-straightforward...

For the broadcasted core dimensions, I do agree the case is less strong and the meaning perhaps less obvious (implementation is relatively simple), and I think a short NEP may be called for (unless others on the list have super-convincing use cases...). I will add here, though, that even if we implement all_equal as a method on equal, it would still be useful to have a signature that can actually describe it.

-- Marten

[1] https://github.com/numpy/numpy/pull/11175/files

Generalized ufunc signatures now allow fixed-size dimensions

By using a numerical value in the signature of a generalized ufunc, one can indicate that the given function requires input or output to have dimensions with the given size. E.g., the signature of a function that converts a polar angle to a two-dimensional cartesian unit vector would be ()->(2); that for one that converts two spherical angles to a three-dimensional unit vector would be (),()->(3); and that for the cross product of two three-dimensional vectors would be (3),(3)->(3).

Note that, from the point of view of the elementary function, these dimensions are not treated any differently from variable ones indicated with a letter; the loop is still passed the corresponding size, but it can now count on that size being equal to the fixed size given in the signature.
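
As a rough illustration of what the fixed-size signatures describe, here is a sketch in plain NumPy. It only emulates the shape behaviour; it is not a real generalized ufunc and not code from the PR. The function names polar_to_unit_vector and cross3 are hypothetical, chosen for this example.

```python
import numpy as np


def polar_to_unit_vector(phi):
    """Emulate a gufunc with signature ()->(2) (hypothetical helper).

    Every scalar angle in the input becomes a 2-element unit vector, so the
    output gains one trailing core dimension of fixed size 2.
    """
    phi = np.asarray(phi, dtype=float)
    return np.stack([np.cos(phi), np.sin(phi)], axis=-1)


def cross3(a, b):
    """Emulate a gufunc with signature (3),(3)->(3) (hypothetical helper).

    The explicit check stands in for what a fixed size in the signature
    would guarantee: the core dimension of each input must have size 3.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape[-1:] != (3,) or b.shape[-1:] != (3,):
        raise ValueError("core dimension must have size 3")
    return np.cross(a, b)


angles = np.zeros((4, 5))                # loop dimensions (4, 5), no core dimension
vectors = polar_to_unit_vector(angles)   # shape (4, 5, 2)

# Loop dimensions broadcast as usual; only the trailing axis is the core one.
products = cross3(np.ones((4, 1, 3)), np.ones((5, 3)))   # shape (4, 5, 3)
```

With a real gufunc the size check would be done by the ufunc machinery before the inner loop is called, which is what lets the loop rely on the size.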

Generalized ufunc signatures now allow flexible dimensions

Some functions, in particular numpy's implementation of @ as matmul, are very similar to generalized ufuncs in that they operate over core dimensions, but one could not present them as such because they were able to deal with inputs in which a dimension is missing. To support this, it is now allowed to postfix a dimension name with a question mark to indicate that that dimension does not necessarily have to be present.

With this addition, the signature for matmul can be expressed as (m?,n),(n,p?)->(m?,p?). This indicates that if, e.g., the second operand has only one dimension, for the purposes of the elementary function it will be treated as if that input has core shape (n, 1), and the output has the corresponding core shape of (m, 1). The actual output array, however, has the flexible dimension removed, i.e., it will have shape (..., m). Similarly, if both arguments have only a single dimension, the inputs will be presented as having shapes (1, n) and (n, 1) to the elementary function, and the output as (1, 1), while the actual output array returned will have shape (). In this way, the signature allows one to use a single elementary function for four related but different signatures, (m,n),(n,p)->(m,p), (n),(n,p)->(p), (m,n),(n)->(m) and (n),(n)->().
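
The padding and stripping of flexible dimensions can be sketched in plain NumPy as follows. This is only an emulation of the behaviour described above, under the assumption that a missing dimension is inserted with size 1 and removed again from the result; it is not the ufunc implementation. The helper name matmul_padded is hypothetical.

```python
import numpy as np


def matmul_padded(a, b):
    """Emulate the signature (m?,n),(n,p?)->(m?,p?) (hypothetical helper).

    A missing flexible dimension is inserted with size 1, the full
    (m,n),(n,p)->(m,p) operation is applied, and the inserted dimension
    is then removed from the output.
    """
    a = np.asarray(a)
    b = np.asarray(b)
    m_missing = a.ndim == 1
    p_missing = b.ndim == 1
    if m_missing:
        a = a[np.newaxis, :]      # (n) -> (1, n)
    if p_missing:
        b = b[:, np.newaxis]      # (n) -> (n, 1)
    out = np.matmul(a, b)         # core shape (m, p)
    if m_missing:
        out = out[..., 0, :]      # drop m: (..., 1, p) -> (..., p)
    if p_missing:
        out = out[..., 0]         # drop p: (..., m, 1) -> (..., m), or (..., 1) -> (...)
    return out


m, n, p = 2, 3, 4
matrix_a, matrix_b = np.ones((m, n)), np.ones((n, p))
vector_a, vector_b = np.ones(n), np.ones(n)

results = {
    "(m,n),(n,p)->(m,p)": matmul_padded(matrix_a, matrix_b),   # shape (2, 4)
    "(n),(n,p)->(p)": matmul_padded(vector_a, matrix_b),       # shape (4,)
    "(m,n),(n)->(m)": matmul_padded(matrix_a, vector_b),       # shape (2,)
    "(n),(n)->()": matmul_padded(vector_a, vector_b),          # shape ()
}
```

Each entry has the same shape that np.matmul itself returns for those inputs, which is the point of the four-in-one signature.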


