Broadcasting rule allows broadcasting 0d array to empty 1d array
Here is an implementation of the broadcasting algorithm from the spec:
import operator

def broadcasted_shape(sh1, sh2):
    if not isinstance(sh1, (tuple, list)) or not isinstance(sh2, (tuple, list)):
        raise TypeError
    shape1 = tuple(operator.index(i) for i in sh1)
    shape2 = tuple(operator.index(i) for i in sh2)
    n1 = len(shape1)
    n2 = len(shape2)
    n = max(n1, n2)
    shape = [0] * n
    i = n - 1
    while i >= 0:
        # Missing leading dimensions are treated as having size 1.
        _n1 = n1 - n + i
        d1 = shape1[_n1] if _n1 >= 0 else 1
        _n2 = n2 - n + i
        d2 = shape2[_n2] if _n2 >= 0 else 1
        if d1 == 1:
            shape[i] = d2
        elif d2 == 1 or d2 == d1:
            shape[i] = d1
        else:
            raise ValueError
        i = i - 1
    return tuple(shape)
With this implementation, broadcasting a 0d array to an empty 1d array is allowed, which is also consistent with NumPy.
I am not sure why this is a logical thing to do, other than to stay compatible with NumPy.
In [1]: from broadcast import broadcasted_shape
In [2]: broadcasted_shape((1,), (0,))
Out[2]: (0,)
In [3]: broadcasted_shape(tuple(), (0,))
Out[3]: (0,)
In [4]: import numpy as np
In [5]: np.broadcast_to(np.array(0), (0,)).shape
Out[5]: (0,)
In [6]: np.broadcast_to(np.array([0]), (0,)).shape
Out[6]: (0,)
The purpose of this issue is to discuss this behavior. If this is as designed, please feel free to close.
Empty array behavior can be hard to reason about, but in general, size 0 dimensions are treated the same as any other dimension size (except 1) for broadcasting. It seems to me that the only reason to have empty arrays in the first place is to use them to keep track of dimensions even when the array itself has no elements, so it makes sense to me that size 0 dimensions shouldn't be special-cased.
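A quick NumPy check illustrates this (using np.broadcast_shapes, which applies the same rules as the spec implementation above): a size 0 dimension matches an equal size, a size 1 dimension stretches to it, and any other size is a broadcast error.

```python
import numpy as np

# Size 0 is treated like any other non-1 size:
assert np.broadcast_shapes((0,), (0,)) == (0,)   # equal sizes match
assert np.broadcast_shapes((1,), (0,)) == (0,)   # size 1 stretches to 0
try:
    np.broadcast_shapes((2,), (0,))
except ValueError as exc:
    print("shapes (2,) and (0,) do not broadcast:", exc)
```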
Anyway, shouldn't it be the case that shape () always broadcasts with everything? This simplifies a lot of things that deal with broadcasting because () effectively serves as an identity element of the broadcasting operation.
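The identity property is easy to verify; a minimal sketch using NumPy's np.broadcast_shapes (the spec implementation above gives the same results):

```python
import numpy as np

# Shape () broadcasts with every shape and leaves it unchanged,
# so () behaves as an identity element for broadcasting.
for shape in [(), (0,), (3,), (2, 0, 3)]:
    assert np.broadcast_shapes((), shape) == shape
    assert np.broadcast_shapes(shape, ()) == shape
```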
My issue with permitting 0d arrays, and 1d arrays containing a single element, to broadcast to empty arrays that have no elements is the loss of data:
In [1]: import numpy as np
In [2]: np.array([0]) + np.array([])
Out[2]: array([], dtype=float64)
One can sort of justify the empty result for a 0d array (interpreted as a scalar), but hardly for a 1d array. Perhaps it's just a quirk we should learn to live with.
The broadcasting behavior of empty arrays according to the spec is as intended. Namely, in general, when an operand is an empty array, one should get an empty array result.
In [1]: x = np.array([[[[1]]]])
In [2]: x.shape
Out[2]: (1, 1, 1, 1)
In [3]: y = np.array([[[[]]]])
In [4]: y.shape
Out[4]: (1, 1, 1, 0)
In [5]: x+y
Out[5]: array([], shape=(1, 1, 1, 0), dtype=float64)
An empty array effectively "poisons" all subsequent operations. IMO, this is reasonable behavior; otherwise, reasoning about the effect of empty arrays can be difficult.
In a sense, empty arrays are akin to NaN values where their presence is probably the result of unintended behavior, so letting them be, e.g., identity elements (e.g., x+y = x when y is empty) does not seem desirable.
Like NaN values, they also tend to work out well when intended! That can happen easily: e.g., you could be gathering data for each day, but some days just don't contain any data. You can then write a perfectly reasonable analysis pipeline without any special-casing:
day = np.array([], dtype=np.float64) # input happens to be empty
day *= conversion_factor
days_sum = day.sum() # 0.0 which makes sense
day_remove_mean = day - day.mean()
days_std = day.std() # NaN, so you know something is "up"
If empty arrays did not broadcast like this, you would have to special-case days containing no data. OTOH, if it leads to unexpected results, I suspect you typically get a value (i.e., NaN) that notifies you that something is wrong.
@kgryte Empty arrays are not consistently "poisonous". This behavior is specific to the case where the other array has size 1:
In [1]: import numpy as np
In [2]: np.array([1]) + np.array([])
Out[2]: array([], dtype=float64)
In [3]: np.array([[1]]) + np.array([])
Out[3]: array([], shape=(1, 0), dtype=float64)
It does not apply if the size is > 1.
In [4]: np.array([1,2]) + np.array([])
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-4-eb04955a9afa> in <module>
----> 1 np.array([1,2]) + np.array([])
ValueError: operands could not be broadcast together with shapes (2,) (0,)
In my opinion, this special-casing of size 1 arrays looks like a bug, but we may choose to live with it for the sake of backward compatibility.
@oleksandr-pavlyk Correct. They are not consistently poisonous in the example you use because, if the size is > 1 and the dimensions do not match, the arrays do not broadcast regardless (i.e., conforming array libraries could/should raise an exception). This is per the broadcasting rules.
Stepping back, the broadcasting rules, as currently specified, don't carve out a special case for size 0 dimensions. They are applied the same whether a dimension is size 0 or has size n >= 1. Whenever corresponding dimensions do not equal 1, they must match. And whenever a dimension is 1, the output broadcasts to the size of the corresponding dimension in the other array. In this sense, 0 is not special.
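To make this concrete, here is a small check (a sketch using np.broadcast_shapes, not part of the spec) showing that substituting 0 or any n > 1 for a dimension size gives the same pattern of outcomes:

```python
import numpy as np

# 0 follows the same rules as any n >= 1: equal sizes match,
# a size 1 dimension stretches, and any other pairing raises.
for n in (0, 5):
    assert np.broadcast_shapes((n,), (n,)) == (n,)
    assert np.broadcast_shapes((1,), (n,)) == (n,)
    try:
        np.broadcast_shapes((3,), (n,))
    except ValueError:
        pass  # 3 != n and neither is 1, so broadcasting fails
    else:
        raise AssertionError("expected a ValueError")
```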
Given that size 0 dimensions could introduce confusion, we should probably add examples demonstrating this in the broadcasting specification.
See https://github.com/data-apis/array-api/pull/407.