Lazy `iris.analysis.cartography.area_weights`

Open • schlunma opened this issue 2 years ago • 1 comment

✨ Feature Request

Currently, iris.analysis.cartography.area_weights always returns a numpy array. Depending on the shape of the input cube, this can use up a lot of memory. It would be really helpful to have a lazy version of it that returns a dask array.

Motivation

Using a dask distributed scheduler in ESMValTool in combination with a preprocessor that requires the calculation of area weights is currently not possible. A lazy version of iris.analysis.cartography.area_weights would solve that.

Design

Solving this is probably easy, since broadcast_to_shape, which is used to broadcast the 2D weights to the cube shape, now supports dask arrays. I see two options for the API:

1. Add an additional keyword argument to iris.analysis.cartography.area_weights (maybe compute, which is True by default?).
2. Let the behavior depend on the input data, i.e., if the cube has lazy data, the function returns lazy data, and vice versa.

I guess option 1 is preferable, since it is fully backwards-compatible.
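To make the two options concrete, here is a rough, standalone sketch of how they could differ. It is not the iris implementation: `cell_areas_2d`, `area_weights_sketch` and `area_weights_like` are hypothetical names, the Earth radius is an assumed value, and the sketch assumes a regular grid whose latitude and longitude are the last two dimensions. It uses plain `numpy.broadcast_to` and `dask.array.broadcast_to` instead of iris's own broadcast_to_shape helper.

```python
import dask.array as da
import numpy as np

EARTH_RADIUS_M = 6367470.0  # assumed spherical Earth radius in metres


def cell_areas_2d(lat_bounds_deg, lon_bounds_deg, radius=EARTH_RADIUS_M):
    """Return (nlat, nlon) cell areas in m^2 for a regular lat-lon grid."""
    lat = np.deg2rad(np.asarray(lat_bounds_deg, dtype=float))  # (nlat, 2)
    lon = np.deg2rad(np.asarray(lon_bounds_deg, dtype=float))  # (nlon, 2)
    # Area of a lat-lon box on a sphere: R^2 * |sin(lat1) - sin(lat0)| * |lon1 - lon0|
    dsin = np.abs(np.sin(lat[:, 1]) - np.sin(lat[:, 0]))
    dlon = np.abs(lon[:, 1] - lon[:, 0])
    return radius**2 * np.outer(dsin, dlon)


def area_weights_sketch(shape, lat_bounds_deg, lon_bounds_deg, compute=True):
    """Option 1 (hypothetical): a keyword decides between NumPy and dask output.

    Assumes latitude and longitude are the last two dimensions of `shape`.
    """
    weights_2d = cell_areas_2d(lat_bounds_deg, lon_bounds_deg)
    if compute:
        # Eager path: a NumPy array (here a read-only broadcast view).
        return np.broadcast_to(weights_2d, shape)
    # Lazy path: only the small 2D array is held in memory until computed.
    return da.broadcast_to(da.from_array(weights_2d), shape)


def area_weights_like(data, lat_bounds_deg, lon_bounds_deg):
    """Option 2 (hypothetical): laziness follows the input data."""
    is_lazy = isinstance(data, da.Array)
    return area_weights_sketch(
        data.shape, lat_bounds_deg, lon_bounds_deg, compute=not is_lazy
    )


if __name__ == "__main__":
    lat_b = [[-90, 0], [0, 90]]
    lon_b = [[0, 180], [180, 360]]
    lazy = area_weights_sketch((3, 2, 2), lat_b, lon_b, compute=False)
    print(type(lazy).__name__, lazy.shape)
    print(type(area_weights_like(np.zeros((3, 2, 2)), lat_b, lon_b)).__name__)
```

With option 1, existing callers keep getting a NumPy array unless they explicitly pass `compute=False`; with option 2, the return type would change silently for anyone who already passes a cube with lazy data.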

If we can agree on an implementation, I can open a PR.

Dec 06 '23 12:12 schlunma

@SciTools/peloton

@schlunma thanks for raising this and offering to put up a PR. We think option 1 is much preferred to option 2.

Dec 13 '23 11:12 HGWright