Discussion: rioxarray.open_rasterio() in addition to xarray.open_rasterio()?
rioxarray is a nice library that extends xarray DataArrays for geospatial data. https://corteva.github.io/rioxarray/stable/getting_started/getting_started.html. It is much like the 'geopandas' equivalent for xarray.
Essentially it 1) provides rioxarray.open_rasterio, a drop-in replacement for xarray.open_rasterio with optimizations for rasterio.open, and 2) extends the returned DataArray objects for handling geospatial data (coordinate reference systems and reprojection in particular). This added functionality would be particularly valuable for intake-stac (https://github.com/intake/intake-stac).
My question is how to organize the option for RasterIOSource to use rioxarray instead of xarray? One path forward could be to create an option for the opener function?
https://github.com/intake/intake-xarray/blob/bd43095c3f418aaf6f5cda0d3fbdf108243611e7/intake_xarray/image.py#L128
related: https://github.com/intake/intake-xarray/issues/61
One path forward could be to create an option for the opener function
Seems like a simple enough option.
Yes, there is scope for a larger discussion about these similar and somewhat interchangeable loaders, similar to some engine= kw, where each option has different capabilities and configuration. I don't know the best way to structure it, which is why this became what I thought of as the simplest approach :)
When rioxarray is installed xr now understands engine=rasterio. Does that solve the discussion @scottyhq ? And could we maybe even replace intake.open_rasterio with open_netcdf(engine="rasterio")?
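For reference, a minimal sketch of what that could look like, assuming rioxarray is installed so that xarray can resolve engine="rasterio". The names `open_raster` and `opener` are hypothetical and not part of intake-xarray; the `opener` hook is only there so the reader can be swapped out.

```python
def open_raster(urlpath, chunks=None, opener=None, **kwargs):
    """Open a raster through xarray's generic reader with engine="rasterio".

    `open_raster` and the `opener` hook are hypothetical names used for
    illustration; they are not part of intake-xarray.
    """
    if opener is None:
        # Imported lazily. The "rasterio" engine is provided by rioxarray,
        # so rioxarray must be installed for this default path to work.
        import xarray as xr

        opener = xr.open_dataset
    # Forward everything to the reader, always pinning the engine.
    return opener(urlpath, engine="rasterio", chunks=chunks, **kwargs)
```

One thing to keep in mind: xr.open_dataset returns a Dataset, whereas open_rasterio returned a DataArray, so existing catalogs might see a different container type.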
The deprecation warnings also suggest moving towards engine="rasterio" and rioxarray:
intake_xarray/tests/test_remote.py::test_http_read_rasterio_pattern
/home/runner/work/intake-xarray/intake-xarray/intake_xarray/raster.py:62: DeprecationWarning: open_rasterio is Deprecated in favor of rioxarray. For information about transitioning, see: https://corteva.github.io/rioxarray/stable/getting_started/getting_started.html
das = [xr.open_rasterio(f, chunks=self.chunks, **self._kwargs)
intake_xarray/tests/test_remote.py::test_s3_read_rasterio
/home/runner/work/intake-xarray/intake-xarray/intake_xarray/raster.py:90: DeprecationWarning: open_rasterio is Deprecated in favor of rioxarray. For information about transitioning, see: https://corteva.github.io/rioxarray/stable/getting_started/getting_started.html
self._ds = xr.open_rasterio(files, chunks=self.chunks,
I suggest either to:
- get rid of https://github.com/intake/intake-xarray/blob/master/intake_xarray/raster.py and rewrite all rasterio tests with the netcdf driver: breaks intake.open_rasterio
- use the rasterio driver as a shortcut to netcdf(engine="rasterio"): keeps intake.open_rasterio
I think indeed, the rasterio driver should become an alias for what is not netcdf; and maybe they should all become just aliases for a generic xarray driver (including zarr?!). We should probably keep the current names and defaults at least for a while, though.
the rasterio driver should become an alias for what is not netcdf
do you mean everything that is not opened by xarray?
maybe they should all become just aliases for a generic xarray driver (including zarr?!)
Is zarr, like csv, supported by intake?
We should probably keep the current names and defaults at least for a while, though.
definitely.
do you mean:
- intake.open_xarray() refactoring intake.open_netcdf()
- intake.open_rasterio() alias for open_xarray(engine="rasterio")
- intake.open_zarr() alias for open_xarray(engine="zarr")
- intake.open_cfgrib() alias for open_xarray(engine="cfgrib")
- intake.open_something_else alias for current_rasterio_driver?
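To make the alias idea concrete, here is a rough sketch of how driver names could map onto engine arguments for a single generic xarray driver. `ENGINE_ALIASES` and `resolve_alias` are hypothetical names, the driver list is just the one from the bullets above, and this does not touch intake's actual driver registration.

```python
# Hypothetical mapping from existing driver names to the xarray engine
# each alias would pin. None means "let xarray pick the engine".
ENGINE_ALIASES = {
    "netcdf": None,
    "rasterio": "rasterio",
    "zarr": "zarr",
    "cfgrib": "cfgrib",
}


def resolve_alias(driver, **kwargs):
    """Return the keyword arguments a generic xarray driver would receive."""
    if driver not in ENGINE_ALIASES:
        raise ValueError(f"unknown driver alias: {driver!r}")
    merged = dict(kwargs)
    engine = ENGINE_ALIASES[driver]
    if engine is not None:
        # An explicit engine passed by the caller still takes precedence.
        merged.setdefault("engine", engine)
    return merged
```

The resulting dict could then be forwarded as xr.open_dataset(urlpath, **resolve_alias("zarr")), which keeps the current names while routing everything through one code path.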
I'm thinking of an arg of use_rioxarray to RasterIOSource which defaults to True.
Had a quick tinker here: https://github.com/raybellwaves/intake-rioxarray/blob/main/intake_rioxarray/catalog.py#L79 as a separate plugin but it should be added to this repo.
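Roughly what I have in mind for the flag, as a sketch only. `opener_location` and `load_opener` are hypothetical helpers and not code from the linked plugin or from RasterIOSource; the only assumption about the libraries is that both expose a function called open_rasterio.

```python
import importlib


def opener_location(use_rioxarray=True):
    """Return (module name, function name) for the chosen raster opener."""
    module_name = "rioxarray" if use_rioxarray else "xarray"
    return module_name, "open_rasterio"


def load_opener(use_rioxarray=True):
    """Import and return the opener function selected by the flag."""
    module_name, func_name = opener_location(use_rioxarray)
    # Imported lazily so rioxarray stays an optional dependency.
    module = importlib.import_module(module_name)
    # Note: xarray.open_rasterio is deprecated, so the use_rioxarray=False
    # branch only works on xarray versions that still ship it.
    return getattr(module, func_name)
```

A source could then call load_opener(self.use_rioxarray)(urlpath, chunks=self.chunks, **kwargs) wherever it currently calls xr.open_rasterio, with `self.use_rioxarray` being the proposed new argument.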
@raybellwaves , perhaps it's best just to put it in a PR, so we can see by the tests and docstrings how to use the various options in practice? I would tend towards @aaronspring 's set of aliases - but actually it doesn't matter much, so long as we are super clear in our description. Ideally, any change would be backwards compatible, by which I mean that existing catalogs continue to function.
@raybellwaves , do you plan on making a PR like this?
Not sure when I'll get round to it to be honest
xarray v2023.04.0 removed the rasterio backend, so xr.open_rasterio no longer works in intake_xarray/raster.py.
Fixed by #132