geowombat icon indicating copy to clipboard operation
geowombat copied to clipboard

Nodata value set to 0 if using ref_bounds

Open rdenham opened this issue 4 years ago • 8 comments

When you open an image using gw.config.update(refbounds=bound), any nodata values in the source data are converted to 0's.

Here is a reproducible example:

# create a dummy raster with nodata set to 255
from osgeo import gdal
from osgeo import osr
from rasterio.coords import BoundingBox

import numpy as np 

ncol = 100
nrow = 100

egdata = np.random.randint(0,10, (nrow,ncol))
# add some nodata
egdata[0:10, 0:10] = 255

rasterOrigin = (1238095,-2305756)
pixelWidth = 10
pixelHeight = -10
originX = rasterOrigin[0]
originY = rasterOrigin[1]
driver = gdal.GetDriverByName('GTiff')
outRaster = driver.Create("test.tif", ncol, nrow, 1, gdal.GDT_Byte)
outRaster.SetGeoTransform((originX, pixelWidth, 0, originY, 0, pixelHeight))
outband = outRaster.GetRasterBand(1)
outband.WriteArray(egdata)
outband.SetNoDataValue(255)
outband.FlushCache()
outRasterSRS = osr.SpatialReference()
outRasterSRS.ImportFromEPSG(3577)
outRaster.SetProjection(outRasterSRS.ExportToWkt())
outband = None
outRaster = None

Now, if you access this using gw:

with gw.config.update():
    with gw.open("test.tif") as src:
        data = src.data.compute()
        print(data[0,0:10,0:10])

you get the expected output, since these are all nodata values in the src

[[255 255 255 255 255 255 255 255 255 255]
 [255 255 255 255 255 255 255 255 255 255]
 [255 255 255 255 255 255 255 255 255 255]
 [255 255 255 255 255 255 255 255 255 255]
 [255 255 255 255 255 255 255 255 255 255]
 [255 255 255 255 255 255 255 255 255 255]
 [255 255 255 255 255 255 255 255 255 255]
 [255 255 255 255 255 255 255 255 255 255]
 [255 255 255 255 255 255 255 255 255 255]
 [255 255 255 255 255 255 255 255 255 255]]

But if you use a bounds argument:

bounds = BoundingBox(left=rasterOrigin[0], bottom=rasterOrigin[1] - 60, right=rasterOrigin[0] + 60, top=rasterOrigin[1])
with gw.config.update(ref_bounds=bounds):
   with gw.open("test.tif") as src:
       data = src.data.compute()
       print(data[0,0:10, 0:10])

You get all 0's:

[[0 0 0 0 0 0]
 [0 0 0 0 0 0]
 [0 0 0 0 0 0]
 [0 0 0 0 0 0]
 [0 0 0 0 0 0]
 [0 0 0 0 0 0]]

As a comparison, rasterio behaves as expected:

with rasterio.open("test.tif") as src:
    rst = src.read(1, window=from_bounds(left=rasterOrigin[0], bottom=rasterOrigin[1] - 60, right=rasterOrigin[0] + 60, top=rasterOrigin[1], transform=src.transform))
    print(rst[0:10, 0:10])
[[255 255 255 255 255 255]
 [255 255 255 255 255 255]
 [255 255 255 255 255 255]
 [255 255 255 255 255 255]
 [255 255 255 255 255 255]
 [255 255 255 255 255 255]]

I experimented with nodata in the config and open but didn't seem to make any difference.

In [68]: print(gw.__version__)
1.7.3

rdenham avatar Feb 04 '22 00:02 rdenham

I think this is because nodata doesn't get passed to warp_open in api.py, about line 491.

I'll prepare a pull request.

rdenham avatar Feb 11 '22 03:02 rdenham

Actually I don't think I can do a pull request without forking first. Perhaps someone could do this for me.

See the lines https://github.com/jgrss/geowombat/blob/e691102a3dcce13b272810b43bc9586681ae6934/geowombat/core/api.py#L491-L497

I think the difference is simply:

❯ git diff --ignore-space-at-eol HEAD^^ HEAD geowombat/core/api.py 
diff --git a/geowombat/core/api.py b/geowombat/core/api.py
index 77672e46..03de56b8 100755
--- a/geowombat/core/api.py
+++ b/geowombat/core/api.py
@@ -493,6 +493,7 @@ class open(object):
                                           resampling=resampling,
                                           dtype=dtype,
                                           netcdf_vars=netcdf_vars,
+                                          nodata=nodata,
                                           **kwargs)
 
                 else:

rdenham avatar Feb 11 '22 04:02 rdenham

Thanks @rdenham, I'll take a look at this issue.

jgrss avatar Feb 11 '22 23:02 jgrss

@rdenham See this PR.

Part of the issue was, as you pointed out, 'nodata' was not being passed to the warper. The other issue was that 'nodata' was not being retrieved from the global config.

This example:

with gw.config.update():
    with gw.open("test.tif") as src:
        data = src.data.compute()
        print(data[0,0:10,0:10])

works because it passes the file name to Xarray and it pulls the nodata value directly.

For this:

bounds = BoundingBox(left=rasterOrigin[0], bottom=rasterOrigin[1] - 60, right=rasterOrigin[0] + 60, top=rasterOrigin[1])
with gw.config.update(ref_bounds=bounds):
   with gw.open("test.tif") as src:
       data = src.data.compute()
       print(data[0,0:10, 0:10])

to work, you will have to pass the 'nodata' value explicitly like:

bounds = BoundingBox(left=rasterOrigin[0], bottom=rasterOrigin[1] - 60, right=rasterOrigin[0] + 60, top=rasterOrigin[1])
with gw.config.update(ref_bounds=bounds, nodata=255):
   with gw.open("test.tif") as src:
       data = src.data.compute()
       print(data[0,0:10, 0:10])

or

bounds = BoundingBox(left=rasterOrigin[0], bottom=rasterOrigin[1] - 60, right=rasterOrigin[0] + 60, top=rasterOrigin[1])
with gw.config.update(ref_bounds=bounds):
   with gw.open("test.tif", nodata=255) as src:
       data = src.data.compute()
       print(data[0,0:10, 0:10])

jgrss avatar Feb 12 '22 00:02 jgrss

Thanks @jgrss

I think I misunderstood the meaning of nodata in gw.open (and gw.config.update). I realise now that it should be interpreted as a replacement value; a value which will replace all nodata values in your dataset; something like

nodata (Optional[float | int]): A 'no data' value to set. All nodata values in the image will be replaced with this value.  Default is 0.

If that is the correct interpretation, shouldn't the following

with gw.open("test.tif", nodata=0) as src:
    data = src.data.compute()
    print(data[0,0:10, 0:10])   

print out an array of 0's instead of 255's?

rdenham avatar Feb 12 '22 01:02 rdenham

You're right regarding the change in description. I will also update it to explain that it only works when warping an image. When you open an image with no modifications then the no data value is taken from the file and not modified.

jgrss avatar Feb 12 '22 01:02 jgrss

You're right regarding the change in description. I will also update it to explain that it only works when warping an image. When you open an image with no modifications then the no data value is taken from the file and not modified.

Maybe it should be removed from open()and only be used in the config. I'll do some cleaning.

jgrss avatar Feb 12 '22 01:02 jgrss

Maybe it should be removed from open()and only be used in the config. I'll do some cleaning.

Just to complicate this, I think nodata will only take effect if you also pass in a ref_bounds object, since without this you won't be using warping.

I think I agree, use it only in config, and add documentation that if you aren't warping by eg passing bounds, then it will be ignored.

rdenham avatar Feb 12 '22 03:02 rdenham

Closing this issue following #204

jgrss avatar Sep 22 '22 02:09 jgrss