
Passing a DataArray into `np.linspace` breaks with NumPy 2

Open hoxbro opened this issue 1 year ago • 1 comments

What happened?

While trying to be compatible with NumPy 2, I discovered the following behavior change.

I could work around it, so a fix is not urgent for me and this can be marked as wontfix.

What did you expect to happen?

No response

Minimal Complete Verifiable Example

import numpy as np
import xarray as xr

arr = np.array([1,2,3,4])
xarr = xr.DataArray(arr, coords=[('x', arr)])
np.linspace(xarr[0], xarr[-1], len(xarr))

MCVE confirmation

  • [x] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [x] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [x] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [ ] New issue — a search of GitHub Issues suggests this is not a duplicate.
  • [ ] Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

Traceback (most recent call last):
  File "/home/shh/projects/holoviz/repos/datashader/example.py", line 6, in <module>
    np.linspace(xarr[0], xarr[-1], len(xarr))
  File "/home/shh/miniconda3/envs/datashader-dev/lib/python3.12/site-packages/numpy/_core/function_base.py", line 189, in linspace
    y = conv.wrap(y.astype(dtype, copy=False))
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shh/miniconda3/envs/datashader-dev/lib/python3.12/site-packages/xarray/core/dataarray.py", line 4685, in __array_wrap__
    new_var = self.variable.__array_wrap__(obj, context)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shh/miniconda3/envs/datashader-dev/lib/python3.12/site-packages/xarray/core/variable.py", line 2294, in __array_wrap__
    return Variable(self.dims, obj)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shh/miniconda3/envs/datashader-dev/lib/python3.12/site-packages/xarray/core/variable.py", line 397, in __init__
    super().__init__(
  File "/home/shh/miniconda3/envs/datashader-dev/lib/python3.12/site-packages/xarray/namedarray/core.py", line 264, in __init__
    self._dims = self._parse_dimensions(dims)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shh/miniconda3/envs/datashader-dev/lib/python3.12/site-packages/xarray/namedarray/core.py", line 490, in _parse_dimensions
    raise ValueError(
ValueError: dimensions () must have the same length as the number of data dimensions, ndim=1

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS
------------------
commit: None
python: 3.12.3 | packaged by conda-forge | (main, Apr 15 2024, 18:38:13) [GCC 12.3.0]
python-bits: 64
OS: Linux
OS-release: 6.8.0-76060800daily20240311-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.14.3
libnetcdf: 4.9.2

xarray: 2024.5.0
pandas: 2.2.2
numpy: 2.0.0rc2
scipy: 1.13.0
netCDF4: 1.6.5
pydap: None
h5netcdf: None
h5py: None
zarr: None
cftime: 1.6.3
nc_time_axis: None
iris: None
bottleneck: None
dask: 2024.5.1
distributed: None
matplotlib: 3.8.4
cartopy: None
seaborn: None
numbagg: None
fsspec: 2024.5.0
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 69.5.1
pip: 24.0
conda: 24.1.2
pytest: 7.4.4
mypy: None
IPython: 8.24.0
sphinx: None

hoxbro, May 23 '24 13:05

I've recently come across this as well (in a suddenly failing xcube unit test) and confirmed its presence in the current xarray repo version (commit b5180749d351f8b85fd39677bf137caaa90288a7). In the xcube case, we've worked around the immediate problem but have a few dozen other linspace calls in our codebase, so it would be doable but a non-trivial effort to go through them all and make sure they'll never encounter a DataArray.

pont-us, Jun 27 '24 07:06

With numpy 1.26, the value returned by np.linspace in this case would be a np.ndarray.

In numpy 2.0, np.linspace calls the __array_wrap__ method of the first input value, which tries to create a Variable with the same dims as that input from the array that np.linspace computed from the input values. But the returned array is 1-D, while the input is 0-d and has no dims, hence the ValueError in the traceback.

I'm not sure there is a way to fix this in xarray, except for maybe creating some default dimension. Since np.linspace is not a "ufunc", it doesn't pass input and output shape info to the array wrapper (conv.wrap in the traceback), which might be used to do something more intelligent.
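The mechanism described above can be sketched without xarray. This is only an illustration: `ZeroDimWrapper` and `try_linspace` are hypothetical names, not xarray or numpy code, and the wrapper only imitates a 0-d object whose `__array_wrap__` rejects anything that is not 0-d. Going by the explanation above, numpy 2.x should end up in the `ValueError` branch while numpy 1.26 should return a plain array, but the printed result depends on the installed numpy version.

```python
import numpy as np


class ZeroDimWrapper:
    """Hypothetical stand-in for a 0-d DataArray (not xarray code)."""

    def __init__(self, value):
        self.value = value

    def __array__(self, dtype=None, copy=None):
        # Lets numpy read the wrapped value as a 0-d array.
        return np.asarray(self.value, dtype=dtype)

    def __array_wrap__(self, arr, context=None, return_scalar=False):
        # Like a Variable with dims=(), refuse anything that is not 0-d.
        if np.ndim(arr) != 0:
            raise ValueError(
                f"dimensions () do not match the data, ndim={np.ndim(arr)}"
            )
        return arr


def try_linspace(start, stop, num):
    """Return the linspace result, or the ValueError raised while wrapping."""
    try:
        return np.linspace(start, stop, num)
    except ValueError as exc:
        return exc


result = try_linspace(ZeroDimWrapper(1), ZeroDimWrapper(4), 4)
print(np.__version__, type(result).__name__)
```

Passing plain Python numbers or numpy scalars instead of the wrapper objects avoids the wrapping step entirely, which is what the workaround below relies on.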

To fix this issue in our code, I added a function to replace np.linspace:

import numpy as np
import xarray as xr
from numpy.typing import DTypeLike
from numpy._typing import _ArrayLikeComplex_co


def xr_linspace_with_np_output(
    start: _ArrayLikeComplex_co | xr.DataArray,
    stop: _ArrayLikeComplex_co | xr.DataArray,
    num: int = 50,
    endpoint: bool = True,
    retstep: bool = False,
    dtype: DTypeLike | None = None,
    axis: int = 0,
) -> np.ndarray:
    """Wrapper around np.linspace that converts DataArray inputs to numpy.

    This mimics the pre-numpy-2.0 behavior of calling `np.linspace` with
    `start` and `stop` selected from a `DataArray`.
    """
    if isinstance(start, xr.DataArray):
        start = start.values

    if isinstance(stop, xr.DataArray):
        stop = stop.values

    return np.linspace(start, stop, num, endpoint, retstep, dtype, axis)

@pont-us you could find and replace np.linspace with a wrapper like this and it should be fine, unless you need the more precise type hints for np.linspace: https://github.com/numpy/numpy/blob/v1.26.5/numpy/core/function_base.pyi

Edit: actually mypy still doesn't like this, probably because the values from the xr.DataArray aren't narrowed to something that is acceptable to np.linspace.
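A shorter variant of the wrapper above, as a sketch only: `np.asarray` converts anything that implements `__array__`, which includes `xr.DataArray`, so the helper needs neither `isinstance` checks nor an xarray import. The name `linspace_plain` is hypothetical, and whether this form satisfies mypy has not been checked.

```python
import numpy as np


def linspace_plain(start, stop, num=50, **kwargs):
    """Call np.linspace on plain ndarrays, dropping any array wrapper.

    np.asarray strips wrappers such as a 0-d DataArray, so np.linspace
    has no custom __array_wrap__ to call on the result.
    """
    return np.linspace(np.asarray(start), np.asarray(stop), num, **kwargs)


# 0-d numpy arrays stand in here for xarr[0] and xarr[-1] from the report.
out = linspace_plain(np.array(1), np.array(4), 4)
print(type(out).__name__, out)
```

With the example from the report, the call would be `linspace_plain(xarr[0], xarr[-1], len(xarr))`.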

brendan-m-murphy, Mar 07 '25 11:03