Broadcasting rule allows broadcasting 0d array to empty 1d array
Here is an implementation of the broadcasting algorithm from the spec:
import operator

def broadcasted_shape(sh1, sh2):
    if not isinstance(sh1, (tuple, list)) or not isinstance(sh2, (tuple, list)):
        raise TypeError
    shape1 = tuple(operator.index(i) for i in sh1)
    shape2 = tuple(operator.index(i) for i in sh2)
    n1 = len(shape1)
    n2 = len(shape2)
    n = max(n1, n2)
    shape = [0] * n
    i = n - 1
    while i >= 0:
        # Missing leading dimensions are treated as having size 1.
        _n1 = n1 - n + i
        d1 = shape1[_n1] if _n1 >= 0 else 1
        _n2 = n2 - n + i
        d2 = shape2[_n2] if _n2 >= 0 else 1
        if d1 == 1:
            shape[i] = d2
        elif d2 == 1 or d2 == d1:
            shape[i] = d1
        else:
            raise ValueError
        i = i - 1
    return tuple(shape)
With this implementation, broadcasting a 0d array to an empty 1d array is allowed, which is also consistent with NumPy.
I am not sure why this is a logical thing to do, other than to stay compatible with NumPy.
In [1]: from broadcast import broadcasted_shape
In [2]: broadcasted_shape((1,), (0,))
Out[2]: (0,)
In [3]: broadcasted_shape(tuple(), (0,))
Out[3]: (0,)
In [4]: import numpy as np
In [5]: np.broadcast_to(np.array(0), (0,)).shape
Out[5]: (0,)
In [6]: np.broadcast_to(np.array([0]), (0,)).shape
Out[6]: (0,)
The purpose of this issue is to discuss this behavior. If this is as designed, please feel free to close.
Empty array behavior can be hard to reason about, but in general, size 0 dimensions are treated the same as any other dimension size (except 1) for broadcasting. It seems to me that the only reason to have empty arrays in the first place is to use them to keep track of dimensions even when the array itself has no elements, so it makes sense to me that size 0 dimensions shouldn't be special-cased.
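A quick NumPy check illustrates this (using np.broadcast_shapes, which applies the same rules as the spec implementation above): a size 0 dimension matches an equal size, a size 1 dimension stretches to it, and any other size is a broadcast error.

```python
import numpy as np

# Size 0 is treated like any other non-1 size:
assert np.broadcast_shapes((0,), (0,)) == (0,)   # equal sizes match
assert np.broadcast_shapes((1,), (0,)) == (0,)   # size 1 stretches to 0
try:
    np.broadcast_shapes((2,), (0,))
except ValueError as exc:
    print("shapes (2,) and (0,) do not broadcast:", exc)
```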
Anyway, shouldn't it be the case that shape () always broadcasts with everything? This simplifies a lot of things that deal with broadcasting because () effectively serves as an identity element of the broadcasting operation.
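The identity property is easy to verify; a minimal sketch using NumPy's np.broadcast_shapes (the spec implementation above gives the same results):

```python
import numpy as np

# Shape () broadcasts with every shape and leaves it unchanged,
# so () behaves as an identity element for broadcasting.
for shape in [(), (0,), (3,), (2, 0, 3)]:
    assert np.broadcast_shapes((), shape) == shape
    assert np.broadcast_shapes(shape, ()) == shape
```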
My issue with permitting 0d arrays, and 1d arrays containing a single element, to broadcast to empty arrays that have no elements is the loss of data:
In [1]: import numpy as np
In [2]: np.array([0]) + np.array([])
Out[2]: array([], dtype=float64)
One can sort of justify the empty result for a 0d array (interpreted as a scalar), but hardly for a 1d array. Perhaps it's just a quirk we should learn to live with.
The broadcasting behavior of empty arrays according to the spec is as intended. Namely, in general, when an operand is an empty array, one should get an empty array result.
In [1]: x = np.array([[[[1]]]])
In [2]: x.shape
Out[2]: (1, 1, 1, 1)
In [3]: y = np.array([[[[]]]])
In [4]: y.shape
Out[4]: (1, 1, 1, 0)
In [5]: x+y
Out[5]: array([], shape=(1, 1, 1, 0), dtype=float64)
An empty array effectively "poisons" all subsequent operations. IMO, this is reasonable behavior; otherwise, reasoning about the effect of empty arrays can be difficult.
In a sense, empty arrays are akin to NaN values where their presence is probably the result of unintended behavior, so letting them be, e.g., identity elements (e.g., x+y = x when y is empty) does not seem desirable.
Like NaN values, they also tend to work out well when intended! That can happen easily: e.g., you could be gathering data for each day, but some days just don't contain any data. You can then write a perfectly reasonable analysis pipeline without any special-casing:
day = np.array([], dtype=np.float64) # input happens to be empty
day *= conversion_factor
days_sum = day.sum() # 0.0 which makes sense
day_remove_mean = day - day.mean()
days_std = day.std() # NaN, so you know something is "up"
If empty arrays did not broadcast like this, you would have to special-case days containing no data. OTOH, if it leads to unexpected results, I suspect you typically get a value (i.e., NaN) that notifies you that something is wrong.
@kgryte Empty arrays are not consistently "poisonous". This behavior is specific to the case where the other array has size 1:
In [1]: import numpy as np
In [2]: np.array([1]) + np.array([])
Out[2]: array([], dtype=float64)
In [3]: np.array([[1]]) + np.array([])
Out[3]: array([], shape=(1, 0), dtype=float64)
It does not apply if the size is > 1.
In [4]: np.array([1,2]) + np.array([])
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-4-eb04955a9afa> in <module>
----> 1 np.array([1,2]) + np.array([])
ValueError: operands could not be broadcast together with shapes (2,) (0,)
In my opinion, this special-casing of size 1 arrays looks like a bug, but we may choose to live with it for the sake of backward compatibility.
@oleksandr-pavlyk Correct. They are not consistently poisonous in the example you use because, if the size is > 1 and the dimensions do not match, the arrays do not broadcast regardless (i.e., conforming array libraries could/should raise an exception). This is per the broadcasting rules.
Stepping back, the broadcasting rules, as currently specified, don't carve out a special case for size 0 dimensions. They are applied the same whether a dimension is size 0 or has size n >= 1. Whenever corresponding dimensions do not equal 1, they must match. And whenever a dimension is 1, the output broadcasts to the size of the corresponding dimension in the other array. In this sense, 0 is not special.
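To make this concrete, here is a small check (a sketch using np.broadcast_shapes, not part of the spec) showing that substituting 0 or any n > 1 for a dimension size gives the same pattern of outcomes:

```python
import numpy as np

# 0 follows the same rules as any n >= 1: equal sizes match,
# a size 1 dimension stretches, and any other pairing raises.
for n in (0, 5):
    assert np.broadcast_shapes((n,), (n,)) == (n,)
    assert np.broadcast_shapes((1,), (n,)) == (n,)
    try:
        np.broadcast_shapes((3,), (n,))
    except ValueError:
        pass  # 3 != n and neither is 1, so broadcasting fails
    else:
        raise AssertionError("expected a ValueError")
```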
Given that size 0 dimensions could introduce confusion, we should probably add examples demonstrating this in the broadcasting specification.
See https://github.com/data-apis/array-api/pull/407.