How to overwrite a raster opened with rioxarray.open_rasterio?

Open openSourcerer9000 opened this issue 3 years ago • 3 comments

Using rioxarray.open_rasterio and then rio.to_raster to edit a raster file raises a permission denied error. I can't figure out any way to make changes to a TIF and then overwrite it. Calling .close() on the DataArray seems to have no effect either.

tif = rioxarray.open_rasterio(tifpath)
tif.close()
tif.rio.to_raster(tifpath)

CPLE_AppDefinedError: Deleting ....tif failed: Permission denied

Expected behavior: the file is overwritten, and the tif.close() call is unnecessary.

As I understand it, xarray reads serialized data lazily, which is why open_rasterio would leave the file open? If that's not accurate, then open_rasterio should either close the file itself, or be implemented as a context manager. Otherwise, it makes sense that rio.to_raster() should be closing the file before overwriting it.

As of now, the only way I've been able to edit a raster is to save it as a different file, close the Python kernel, and manually copy the new file over the old one, which defeats the purpose of using Python in the first place.

FYI, I'm on Windows with rioxarray 0.9.0 and xarray 0.20.0.

openSourcerer9000, Mar 08 '22 17:03

Short answer: I don't recommend overwriting the same file you opened. Instead, write to a new file. Potentially useful: https://rasterio.readthedocs.io/en/latest/api/rasterio.shutil.html Related: #477; https://github.com/pydata/xarray/issues/2887

rioxarray lazily loads the data from the file you are reading. Even if you close the file before writing to disk, it has to re-open the file in order to load the data. At the same time, it has opened the same path for writing. One potential workaround is to load the data into memory before closing the file, and only then write to disk (see: Dataset.load). However, that may or may not be the end of the hurdles to overcome.

Here is an example of something to try:

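import rioxarray

with rioxarray.open_rasterio(tifpath) as tif:
    data = tif.load()  # read all of the data into memory while the file is open
data.rio.to_raster(tifpath)  # write only after the context manager has exited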

snowman2, Mar 08 '22 20:03

That explanation makes sense. The code snippet doesn't work, however: the file is still never closed once the context manager exits. If you later print tif it will read out your DataArray again.

I believe the purpose of computer files in general is to access data without having to hold it all in memory. The problem is that there seems to be no possible way to ever close a raster opened with rioxarray.open_rasterio, without killing the Python kernel.

It should be possible, whether under the hood or user-implemented, to at least read the file, make changes, write to a separate temporary file, then close the original file and later overwrite it with the temporary file. It seems the ability to actually close the file gets lost in the abstraction to open_rasterio.
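The workflow described above (write to a temporary file, then swap it over the original) can be sketched with the standard library alone. This is only an illustration under stated assumptions: overwrite_via_temp and its write_fn parameter are hypothetical names, not part of rioxarray, and the final os.replace will still fail on Windows if any handle on the original file remains open, so it does not by itself fix a handle that never gets released.

```python
import os
import tempfile
from typing import Callable


def overwrite_via_temp(target: str, write_fn: Callable[[str], None]) -> None:
    """Write to a temporary sibling file, then swap it over ``target``.

    ``write_fn`` is a hypothetical callback that receives a path and writes
    the new file contents there.
    """
    directory = os.path.dirname(os.path.abspath(target))
    suffix = os.path.splitext(target)[1]
    # Keep the temp file in the same directory so the swap stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(suffix=suffix, dir=directory)
    os.close(fd)  # the writer opens the path itself
    try:
        write_fn(tmp_name)
        # Replace the original only after the new file is written in full.
        os.replace(tmp_name, target)
    except BaseException:
        # Clean up the temp file and leave the original untouched on failure.
        if os.path.exists(tmp_name):
            os.remove(tmp_name)
        raise


# Hypothetical usage with rioxarray (not run here; tifpath is assumed to exist):
# with rioxarray.open_rasterio(tifpath) as tif:
#     data = tif.load()
# overwrite_via_temp(tifpath, data.rio.to_raster)
```

Writing to a sibling file first means a failed write never damages the original raster.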

openSourcerer9000, Mar 09 '22 22:03

This is the code that should make the file closable: https://github.com/corteva/rioxarray/blob/2888deb1e2abeccf1b7b531378a58762d40685aa/rioxarray/_io.py#L948-L950

See: DataArray.set_close & Dataset.set_close.
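To illustrate the mechanism, here is a minimal sketch using plain xarray and an in-memory array, with no raster file involved. The events list and the lambda are stand-ins invented for the example; in rioxarray the registered function would be whatever the linked code passes to set_close.

```python
import numpy as np
import xarray as xr

events = []  # records calls to the stand-in "close the file" function

# In-memory array standing in for data already pulled in with .load()
da = xr.DataArray(np.arange(4).reshape(2, 2), dims=("y", "x"))

# Register the function that da.close() will call
da.set_close(lambda: events.append("closed"))

with da:
    pass  # leaving the block calls da.close(), which runs the registered function

total = int(da.sum())  # in-memory values remain readable after close
```

Here the registered function runs on exit from the with block, yet the values are still accessible afterwards because they already live in memory.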

"If you later print tif it will read out your DataArray again."

Calling load pulls all of the data into the xarray.DataArray instead of loading it lazily. Being able to access the data inside the DataArray object isn't necessarily an indicator that the file is still open.

You should still be able to access the data in tif after it closes:

with rioxarray.open_rasterio(tifpath) as tif:
    tif.load()  # loads the data in place, so tif itself holds it in memory
tif.rio.to_raster(tifpath)

Though, there is a possibility something isn't being closed. I recommend looking here and potentially digging into the xarray code.

snowman2, Mar 10 '22 15:03