Wrapper around metpy.calc.tke errors
What went wrong?
On the following code, I get a ValueError: different number of dimensions on data and dims: 3 vs 4, which I would not expect to be thrown.
Operating System
Linux
Version
1.3.1
Python Version
3.10
Code to Reproduce
import xarray as xr
from metpy.calc import tke
import numpy as np
u = v = w = xr.Variable(('Time', 'z', 'y', 'x'), np.random.rand(10, 30, 100, 100), attrs={'units': 'm/s'})
ds = xr.Dataset(data_vars={'U': u, 'V': v, 'W': w}, coords={'Time': np.arange(10), 'z': np.arange(30), 'y': np.arange(100), 'x': np.arange(100)}).metpy.quantify() #quantify not necessary
tke(ds['U'], ds['V'], ds['W'], axis=0)
Errors, Traceback, and Logs
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In [4], line 7
5 u = v = w = xr.Variable(('Time', 'z', 'y', 'x'), np.random.rand(10, 30, 100, 100), attrs={'units': 'm/s'})
6 ds = xr.Dataset(data_vars={'U': u, 'V': v, 'W': w}, coords={'Time': np.arange(10), 'z': np.arange(30), 'y': np.arange(100), 'x': np.arange(100)}).metpy.quantify()
----> 7 tke(ds['U'], ds['V'], ds['W'], axis=0)
File ~/.conda/envs/wrf_proc/lib/python3.10/site-packages/metpy/xarray.py:1249, in preprocess_and_wrap.<locals>.decorator.<locals>.wrapper(*args, **kwargs)
1247 return tuple(wrapping(*args) for args in zip(result, match))
1248 else:
-> 1249 return wrapping(result, match)
File ~/.conda/envs/wrf_proc/lib/python3.10/site-packages/metpy/xarray.py:1288, in _wrap_output_like_not_matching_units(result, match)
1279 if (
1280 not isinstance(result, units.Quantity)
1281 and (
(...)
1284 )
1285 ):
1286 result = units.Quantity(result)
1287 return (
-> 1288 xr.DataArray(result, coords=match.coords, dims=match.dims) if output_xarray
1289 else result
1290 )
File ~/.conda/envs/wrf_proc/lib/python3.10/site-packages/xarray/core/dataarray.py:413, in DataArray.__init__(self, data, coords, dims, name, attrs, indexes, fastpath)
411 data = _check_data_shape(data, coords, dims)
412 data = as_compatible_data(data)
--> 413 coords, dims = _infer_coords_and_dims(data.shape, coords, dims)
414 variable = Variable(dims, data, attrs, fastpath=True)
415 indexes, coords = _create_indexes_from_coords(coords)
File ~/.conda/envs/wrf_proc/lib/python3.10/site-packages/xarray/core/dataarray.py:130, in _infer_coords_and_dims(shape, coords, dims)
128 dims = tuple(dims)
129 elif len(dims) != len(shape):
--> 130 raise ValueError(
131 "different number of dimensions on data "
132 f"and dims: {len(shape)} vs {len(dims)}"
133 )
134 else:
135 for d in dims:
ValueError: different number of dimensions on data and dims: 3 vs 4
Hi @lpilz,
Thanks for raising the issue. I've dug a bit and it appears that our tke function is failing with its use of xarray DataArray objects. I don't think we have touched this part of the code in a long time and has not been updated to work with that type of data object (although we should and would welcome any PR to help improve our code base). As a work around currently, you could use the following:
import xarray as xr
from metpy.calc import tke
import numpy as np
u = v = w = xr.Variable(('Time', 'z', 'y', 'x'), np.random.rand(10, 30, 100, 100), attrs={'units': 'm/s'})
ds = xr.Dataset(data_vars={'U': u, 'V': v, 'W': w}, coords={'Time': np.arange(10),
'z': np.arange(30),
'y': np.arange(100),
'x': np.arange(100)}).metpy.quantify() #quantify not necessary
tke(ds['U'].data, ds['V'].data, ds['W'].data, axis=0)
Note the added .data to the calls in the tke function. This preserves the pint unit array, but gets rid of all of the coordinate information.
@kgoebber's work-around is good.
Digging in, the problem is that tke is essentially performing a reduction along the given axis, and our preprocess_and_wrap() doesn't handle that. (Correct me if I'm wrong @jthielen ?)
The only functions we have that act similarly are things like precipitable_water and cape, which don't work on grids yet and don't return xarray objects.
The easy fix is to change the use of preprocess_and_wrap() in turbulence.py and remove the use of wrap_like since it's not correct, but this would result in returning regular pint-wrapped numpy arrays. I wonder if we could make preprocess_and_wrap look for an axis argument and automatically do what we would want?
Another thing we should consider in this is whether we should add a version of add_vertical_dim_from_xarray (or even just generalize that one) so that we automatically use the time dimension in those functions. This would require being careful to maintain backwards compatibility.
I'll note there's a small window here where we know noone can possibly be passing in xarray data to these functions successfully (there's no way the wrapping will work), so there's nothing to maintain backwards with regards to behavior given xarray data.