`Dataset.reduce` pass through non-numeric scalars

Open mathause opened this issue 1 year ago • 1 comments

  • [ ] Closes #xxxx
  • [ ] Tests added
  • [ ] User visible changes (including notable bug fixes) are documented in whats-new.rst
  • [ ] New functions/methods are listed in api.rst

Reducing a non-numeric scalar currently raises an error. This PR passes such scalars through unchanged.

import xarray as xr
ds = xr.Dataset(data_vars={"y": ((), "string")})
ds.mean()
TypeError: the resolved dtypes are not compatible with add.reduce. Resolved (dtype('<U6'), dtype('<U6'), dtype('<U12'))

  • I did not find an existing issue for this
  • Alternatively, we could skip them (this is already done for non-numeric arrays)
  • Numeric scalars are still reduced

mathause, Oct 10 '24 09:10

I'm pretty sure this was intentional -- mean of a string is not well defined.

Could we make this opt-in, by adding a new keyword argument -- perhaps something like `ds.mean(skip_nonnumeric=True)`?

As you note, there's also the question of whether to drop or pass through non-numeric values. I'm not sure which is more intuitive.
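To make the two options concrete, here is a minimal sketch in plain Python rather than xarray. Everything in it is hypothetical: `mean_vars` is a toy stand-in for a dataset-wide mean over a dict of variables, `skip_nonnumeric` is the opt-in keyword floated above (not an existing xarray argument), and `nonnumeric` is an invented switch between passing values through and dropping them. Unlike xarray, the toy treats non-numeric scalars and arrays the same way.

```python
from numbers import Number
from statistics import fmean


def mean_vars(variables, skip_nonnumeric=False, nonnumeric="pass"):
    """Toy mean over a dict of name -> scalar or non-empty list.

    skip_nonnumeric: hypothetical opt-in flag; when False, non-numeric
        variables raise TypeError (the strict default).
    nonnumeric: "pass" keeps non-numeric variables unchanged,
        "drop" leaves them out of the result.
    """
    if nonnumeric not in ("pass", "drop"):
        raise ValueError("nonnumeric must be 'pass' or 'drop'")

    result = {}
    for name, value in variables.items():
        items = value if isinstance(value, list) else [value]
        if all(isinstance(v, Number) for v in items):
            # Numeric variables (scalars included) are always reduced.
            result[name] = fmean(items)
        elif not skip_nonnumeric:
            raise TypeError(f"cannot take the mean of non-numeric variable {name!r}")
        elif nonnumeric == "pass":
            # Pass-through: keep the original value untouched.
            result[name] = value
        # nonnumeric == "drop": the variable is omitted from the result.
    return result


data = {"x": [1.0, 2.0, 3.0], "y": "string"}
print(mean_vars(data, skip_nonnumeric=True))                     # {'x': 2.0, 'y': 'string'}
print(mean_vars(data, skip_nonnumeric=True, nonnumeric="drop"))  # {'x': 2.0}
```

With pass-through, the result keeps every variable name but mixes reduced and unreduced values. With drop, everything in the result is a mean, but variables silently disappear.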

shoyer, Oct 20 '24 19:10

Thanks for the review. I don't have the capacity to implement this, so I'm closing this PR. (I think my scalars were actually coords, so it was fine there.)

mathause, Jun 05 '25 13:06