intake-xarray

Discussion: rioxarray.open_rasterio() in addition to xarray.open_rasterio()?

Open scottyhq opened this issue 5 years ago • 10 comments

rioxarray is a nice library that extends xarray DataArrays for geospatial data (https://corteva.github.io/rioxarray/stable/getting_started/getting_started.html). It is roughly the geopandas equivalent for xarray.

Essentially it 1) provides rioxarray.open_rasterio, a drop-in replacement for xarray.open_rasterio with optimizations for rasterio.open, and 2) extends the returned DataArray objects for handling geospatial data (coordinate reference systems and reprojection in particular). This added functionality would be particularly valuable for intake-stac (https://github.com/intake/intake-stac).

My question is how to organize the option for RasterIOSource to use rioxarray instead of xarray. One path forward could be to create an option for the opener function: https://github.com/intake/intake-xarray/blob/bd43095c3f418aaf6f5cda0d3fbdf108243611e7/intake_xarray/image.py#L128
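A minimal sketch of what such an opener option might look like. The `resolve_opener` helper and the `opener` values are hypothetical and not part of intake-xarray; only `rioxarray.open_rasterio` and `xr.open_rasterio` are real functions, and the latter only exists in older xarray releases.

```python
from typing import Any, Callable


def resolve_opener(opener: str = "xarray") -> Callable[..., Any]:
    """Return the function used to open raster files.

    The ``opener`` argument and its accepted values are hypothetical;
    they are not part of the intake-xarray API.
    """
    if opener == "rioxarray":
        # Imported lazily so rioxarray stays an optional dependency.
        import rioxarray

        return rioxarray.open_rasterio
    if opener == "xarray":
        import xarray as xr

        # Only available in xarray releases that still ship open_rasterio.
        return xr.open_rasterio
    raise ValueError(f"unknown opener: {opener!r}")
```

A source class could then call `resolve_opener(self.opener)(path, chunks=self.chunks, **self._kwargs)` in place of the hard-coded `xr.open_rasterio` call.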

related: https://github.com/intake/intake-xarray/issues/61

scottyhq avatar Oct 22 '20 19:10 scottyhq

One path forward could be to create an option for the opener function

Seems like a simple enough option. Yes, there is scope for a larger discussion about these similar and somewhat interchangeable loaders, similar to some engine= kw, where each option has different capabilities and configuration. I don't know the best way to structure it, which is why this became what I thought of as the simplest approach :)

martindurant avatar Oct 22 '20 19:10 martindurant

When rioxarray is installed, xr now understands engine="rasterio". Does that settle the discussion, @scottyhq? And could we maybe even replace intake.open_rasterio with open_netcdf(engine="rasterio")?
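For illustration, a small sketch of opening a raster through the rasterio engine. The helper names `rasterio_engine_kwargs` and `open_raster` are hypothetical; the only library call used is `xr.open_dataset(..., engine="rasterio")`, which needs both xarray and rioxarray installed. As far as I know, this returns a Dataset rather than the DataArray that xr.open_rasterio produced, so downstream code may need adjusting.

```python
from typing import Any, Dict, Optional


def rasterio_engine_kwargs(chunks: Optional[dict] = None, **extra: Any) -> Dict[str, Any]:
    """Build keyword arguments for xr.open_dataset using the rasterio engine."""
    kwargs: Dict[str, Any] = {**extra, "engine": "rasterio"}
    if chunks is not None:
        kwargs["chunks"] = chunks  # dask chunking, passed through unchanged
    return kwargs


def open_raster(path: str, chunks: Optional[dict] = None, **extra: Any):
    """Open a raster file with xarray (requires xarray and rioxarray)."""
    # rioxarray registers the "rasterio" engine with xarray when installed.
    import xarray as xr

    return xr.open_dataset(path, **rasterio_engine_kwargs(chunks, **extra))
```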

aaronspring avatar Mar 19 '22 09:03 aaronspring

The deprecation warning also suggests moving towards engine="rasterio" and rioxarray:

intake_xarray/tests/test_remote.py::test_http_read_rasterio_pattern
  /home/runner/work/intake-xarray/intake-xarray/intake_xarray/raster.py:62: DeprecationWarning: open_rasterio is Deprecated in favor of rioxarray. For information about transitioning, see: https://corteva.github.io/rioxarray/stable/getting_started/getting_started.html
    das = [xr.open_rasterio(f, chunks=self.chunks, **self._kwargs)

intake_xarray/tests/test_remote.py::test_s3_read_rasterio
  /home/runner/work/intake-xarray/intake-xarray/intake_xarray/raster.py:90: DeprecationWarning: open_rasterio is Deprecated in favor of rioxarray. For information about transitioning, see: https://corteva.github.io/rioxarray/stable/getting_started/getting_started.html
    self._ds = xr.open_rasterio(files, chunks=self.chunks,

aaronspring avatar Mar 24 '22 09:03 aaronspring

I suggest either:

  • get rid of https://github.com/intake/intake-xarray/blob/master/intake_xarray/raster.py and rewrite all rasterio tests with the netcdf driver, which breaks intake.open_rasterio
  • use the rasterio driver as a shortcut to netcdf(engine="rasterio"), which keeps intake.open_rasterio

aaronspring avatar Mar 24 '22 09:03 aaronspring

I think indeed, the rasterio driver should become an alias for what is not netcdf; and maybe they should all become just aliases for a generic xarray driver (including zarr?!). We should probably keep the current names and defaults at least for a while, though.

martindurant avatar Mar 29 '22 18:03 martindurant

the rasterio driver should become an alias for what is not netcdf

do you mean everything that is not opened by xarray?

maybe they should all become just aliases for a generic xarray driver (including zarr?!)

Is zarr, like csv, supported by intake?

We should probably keep the current names and defaults at least for a while, though.

definitely.


do you mean:

  • intake.open_xarray() as a refactoring of intake.open_netcdf()
  • intake.open_rasterio() alias for open_xarray(engine="rasterio")
  • intake.open_zarr() alias for open_xarray(engine="zarr")
  • intake.open_cfgrib() alias for open_xarray(engine="cfgrib")
  • intake.open_something_else alias for current_rasterio_driver?
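The alias idea in this list could be sketched roughly as follows. All class names here are hypothetical stand-ins, not the actual intake-xarray drivers; the only real call is `xr.open_dataset` with its `engine` keyword.

```python
from typing import Any, Dict, Optional


class GenericXarraySource:
    """Hypothetical generic driver standing in for a refactored open_netcdf."""

    default_engine: Optional[str] = None  # None lets xarray pick the engine

    def __init__(self, urlpath: str, engine: Optional[str] = None, **kwargs: Any) -> None:
        self.urlpath = urlpath
        # An explicit engine wins over the alias default.
        self.engine = engine if engine is not None else self.default_engine
        self.kwargs = kwargs

    def open_kwargs(self) -> Dict[str, Any]:
        """Keyword arguments that would be passed to xr.open_dataset."""
        kwargs = dict(self.kwargs)
        if self.engine is not None:
            kwargs["engine"] = self.engine
        return kwargs

    def read(self):
        import xarray as xr  # imported lazily; needed only when reading

        return xr.open_dataset(self.urlpath, **self.open_kwargs())


class RasterioAlias(GenericXarraySource):
    default_engine = "rasterio"


class ZarrAlias(GenericXarraySource):
    default_engine = "zarr"


class CfgribAlias(GenericXarraySource):
    default_engine = "cfgrib"
```

Keeping each alias as a thin subclass means the existing driver names can stay registered, so existing catalogs would not need to change.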

aaronspring avatar Mar 30 '22 12:03 aaronspring

I'm thinking of adding a use_rioxarray argument to RasterIOSource that defaults to True.

I had a quick tinker with this as a separate plugin, but it should be added to this repo: https://github.com/raybellwaves/intake-rioxarray/blob/main/intake_rioxarray/catalog.py#L79

raybellwaves avatar Jun 09 '22 02:06 raybellwaves

@raybellwaves, perhaps it's best just to put it in a PR, so we can see from the tests and docstrings how to use the various options in practice? I would tend towards @aaronspring's set of aliases, but actually it doesn't matter much, so long as we are super clear in our description. Ideally, any change would be backwards compatible, by which I mean that existing catalogs continue to function.

martindurant avatar Jun 13 '22 17:06 martindurant

@raybellwaves , do you plan on making a PR like this?

martindurant avatar Jun 22 '22 20:06 martindurant

@raybellwaves , do you plan on making a PR like this?

Not sure when I'll get round to it to be honest

raybellwaves avatar Jun 23 '22 00:06 raybellwaves

xarray v2023.04.0 removed the rasterio backend, so xr.open_rasterio no longer works in intake_xarray/raster.py.
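One way to detect this at runtime is to check whether the installed xarray still exposes the function. This is only a sketch; the helper name `has_legacy_open_rasterio` is hypothetical and is not taken from the fix that was eventually merged.

```python
import importlib


def has_legacy_open_rasterio(module_name: str = "xarray") -> bool:
    """Return True if the named module can be imported and has open_rasterio."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module is not installed at all.
        return False
    return hasattr(module, "open_rasterio")
```

A driver could use such a check to fall back to rioxarray.open_rasterio whenever it returns False.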

droumis avatar Apr 28 '23 07:04 droumis

Fixed by #132

martindurant avatar May 17 '23 00:05 martindurant