RFC: clarify `broadcast_to` semantics

Open adityagoel4512 opened this issue 1 year ago • 7 comments

I'm finding the broadcast_to specification a little underspecified. In the docs we see the following for the shape parameter:

shape (Tuple[int, ...]) – array shape. Must be compatible with x (see Broadcasting). If the array is incompatible with the specified shape, the function should raise an exception.

The broadcasting link goes on to specify bidirectional broadcasting. That would imply to me that np.broadcast_to(np.asarray([[-1, -1], [-1, -1]]), (2, 1, 2)) should work, since shapes (2, 2) and (2, 1, 2) are bidirectionally compatible. Somewhat reasonably in my opinion, NumPy does not interpret it that way and raises an exception.

Since np.broadcast_to(np.asarray([[-1, -1], [-1, -1]]), (1, 2, 2)) does work, it seems that broadcasting compatibility is unidirectional, i.e. x.shape must be broadcastable to shape. Is it worth spelling out this difference explicitly, like ONNX does? I couldn't find any explanation in the standard itself.
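For reference, the two calls can be reproduced with a short NumPy sketch. The helper name try_broadcast_to is made up for illustration, and the ValueError it catches is NumPy's behavior, not something the standard prescribes:

```python
import numpy as np

x = np.asarray([[-1, -1], [-1, -1]])  # shape (2, 2)


def try_broadcast_to(array, shape):
    """Hypothetical helper: return the resulting shape, or None if NumPy rejects it."""
    try:
        return np.broadcast_to(array, shape).shape
    except ValueError:
        return None


# (2, 2) can be broadcast *to* (1, 2, 2): only a leading axis is added.
accepted = try_broadcast_to(x, (1, 2, 2))

# (2, 2) cannot be broadcast *to* (2, 1, 2): that would shrink an axis of size 2 to 1.
rejected = try_broadcast_to(x, (2, 1, 2))

# The two shapes are still compatible in the bidirectional sense: they share a common shape.
common = np.broadcast_shapes(x.shape, (2, 1, 2))

print(accepted, rejected, common)
```

The last line shows why the wording is ambiguous: the shapes do have a common broadcast shape, but that shape is not the one passed to broadcast_to.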

It does say the following, although I read the "a specified shape" part as "any shape" rather than simply the shape parameter.

Returns: out (array) – an array having a specified shape. Must have the same data type as x.

If this ambiguity is shared I am happy to contribute a clarification.

adityagoel4512 avatar Jul 17 '24 21:07 adityagoel4512

I agree it should be updated. The unidirectional broadcasting is important for in-place operators (see https://data-apis.org/array-api/latest/API_specification/broadcasting.html#in-place-semantics).
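A minimal sketch of the in-place case, using NumPy as the example library (the exception type is NumPy-specific and only an assumption about what other libraries would do):

```python
import numpy as np

x = np.zeros((2, 2))

# Out-of-place: both operands are broadcast to the common shape (1, 2, 2).
out_of_place_shape = (x + np.ones((1, 2, 2))).shape

# In-place: the right-hand side must broadcast *to* x.shape, so shape (2,) is fine.
x += np.ones((2,))

# Shape (1, 2, 2) would require x itself to change shape, which NumPy rejects.
try:
    x += np.ones((1, 2, 2))
    in_place_error = None
except ValueError as exc:
    in_place_error = exc

print(out_of_place_shape, x.shape, type(in_place_error).__name__)
```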

asmeurer avatar Jul 17 '24 21:07 asmeurer

@asmeurer While I'm not aware of a read-only concept in the array API, I just wanted to point out that NumPy's broadcast_to does place that restriction on the returned array: https://numpy.org/doc/stable/reference/generated/numpy.broadcast_to.html#numpy-broadcast-to
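To illustrate the restriction, here is a small sketch of what NumPy does today. This is NumPy-specific behavior and not part of the array API standard:

```python
import numpy as np

x = np.asarray([1, 2, 3])
view = np.broadcast_to(x, (2, 3))

# NumPy marks the broadcast result as read-only.
is_writeable = view.flags.writeable

# An in-place update on the read-only result raises instead of writing through to x.
try:
    view += 1
    error = None
except ValueError as exc:
    error = exc

print(is_writeable, type(error).__name__, x.tolist())
```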

cbourjau avatar Jul 18 '24 07:07 cbourjau

Read-only isn't a concept that's in the array API. Not all libraries might implement it. The array API leaves all mutation with views undefined so this isn't an issue.

asmeurer avatar Jul 18 '24 07:07 asmeurer

It is a slight issue, since += can't work on read-only arrays, but I agree it's niche enough not to worry about.

Not sure I like the "bidirectional" term, but happy if Aaron is. It might be good to just say that arrays are broadcastable when there is a common broadcast shape that both arrays can be "broadcast to" (with the algorithm describing how to find said broadcast shape). I think "bidirectional" might make you think you could shrink a dimension if the values are identical along it, which isn't a concept (and I don't think that needs to be explained anywhere).

seberg avatar Jul 18 '24 09:07 seberg

Agreed that it's a slight issue at the moment - more for NumPy than for the array API standard though. The ideal solution would be something like copy-on-write for NumPy, which could be introduced in a backwards compatible way since += & co now raise an exception.

rgommers avatar Jul 18 '24 09:07 rgommers

> more for NumPy than for the array API standard though

I'll have to disagree until I see a clearer plan on how NumPy could introduce CoW (for read-only arrays?) while not generally breaking view semantics in a way that the whole world gets wrong results. (And NumPy isn't the only array library that uses view semantics!)

seberg avatar Jul 18 '24 09:07 seberg

I can't say that I find the names "bidirectional" and "unidirectional" particularly appealing. I personally think of broadcasting as a (non-closed) binary operation on shape tuples. "Unidirectional" broadcasting is a special case where the result of the broadcast has to be the same as the second shape.
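A rough pure-Python sketch of that view, with hypothetical names (broadcast_shapes and can_broadcast_to are illustrative here, not functions from the standard): broadcasting is a partial binary operation on shape tuples, and the broadcast_to case is the check that the result equals the second shape.

```python
from itertools import zip_longest
from typing import Optional, Tuple

Shape = Tuple[int, ...]


def broadcast_shapes(a: Shape, b: Shape) -> Optional[Shape]:
    """Return the common broadcast shape of a and b, or None if they are incompatible."""
    result = []
    # Align the shapes on their trailing axes; missing leading axes count as size 1.
    for m, n in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if m == n or n == 1:
            result.append(m)
        elif m == 1:
            result.append(n)
        else:
            return None  # The operation is not defined for this pair of shapes.
    return tuple(reversed(result))


def can_broadcast_to(src: Shape, target: Shape) -> bool:
    """One-directional check: broadcasting src with target must yield exactly target."""
    return broadcast_shapes(src, target) == target


print(broadcast_shapes((2, 2), (2, 1, 2)))  # common shape exists
print(can_broadcast_to((2, 2), (2, 1, 2)))  # but it is not the target shape
print(can_broadcast_to((2, 2), (1, 2, 2)))
```

Returning None for incompatible shapes is just one way to model the "non-closed" part; raising an exception would work equally well.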

asmeurer avatar Jul 18 '24 18:07 asmeurer

I've opened https://github.com/data-apis/array-api/pull/888 to hopefully clarify expected broadcast behavior.

kgryte avatar Jan 23 '25 10:01 kgryte