Nodata value set to 0 if using ref_bounds
When you open an image using gw.config.update(refbounds=bound), any nodata values in the source data are converted to 0's.
Here is a reproducible example:
# create a dummy raster with nodata set to 255
from osgeo import gdal
from osgeo import osr
from rasterio.coords import BoundingBox
import numpy as np
ncol = 100
nrow = 100
egdata = np.random.randint(0,10, (nrow,ncol))
# add some nodata
egdata[0:10, 0:10] = 255
rasterOrigin = (1238095,-2305756)
pixelWidth = 10
pixelHeight = -10
originX = rasterOrigin[0]
originY = rasterOrigin[1]
driver = gdal.GetDriverByName('GTiff')
outRaster = driver.Create("test.tif", ncol, nrow, 1, gdal.GDT_Byte)
outRaster.SetGeoTransform((originX, pixelWidth, 0, originY, 0, pixelHeight))
outband = outRaster.GetRasterBand(1)
outband.WriteArray(egdata)
outband.SetNoDataValue(255)
outband.FlushCache()
outRasterSRS = osr.SpatialReference()
outRasterSRS.ImportFromEPSG(3577)
outRaster.SetProjection(outRasterSRS.ExportToWkt())
outband = None
outRaster = None
Now, if you access this using gw:
with gw.config.update():
with gw.open("test.tif") as src:
data = src.data.compute()
print(data[0,0:10,0:10])
you get the expected output, since these are all nodata values in the src
[[255 255 255 255 255 255 255 255 255 255]
[255 255 255 255 255 255 255 255 255 255]
[255 255 255 255 255 255 255 255 255 255]
[255 255 255 255 255 255 255 255 255 255]
[255 255 255 255 255 255 255 255 255 255]
[255 255 255 255 255 255 255 255 255 255]
[255 255 255 255 255 255 255 255 255 255]
[255 255 255 255 255 255 255 255 255 255]
[255 255 255 255 255 255 255 255 255 255]
[255 255 255 255 255 255 255 255 255 255]]
But if you use a bounds argument:
bounds = BoundingBox(left=rasterOrigin[0], bottom=rasterOrigin[1] - 60, right=rasterOrigin[0] + 60, top=rasterOrigin[1])
with gw.config.update(ref_bounds=bounds):
with gw.open("test.tif") as src:
data = src.data.compute()
print(data[0,0:10, 0:10])
You get all 0's:
[[0 0 0 0 0 0]
[0 0 0 0 0 0]
[0 0 0 0 0 0]
[0 0 0 0 0 0]
[0 0 0 0 0 0]
[0 0 0 0 0 0]]
As a comparison, rasterio behaves as expected:
with rasterio.open("test.tif") as src:
rst = src.read(1, window=from_bounds(left=rasterOrigin[0], bottom=rasterOrigin[1] - 60, right=rasterOrigin[0] + 60, top=rasterOrigin[1], transform=src.transform))
print(rst[0:10, 0:10])
[[255 255 255 255 255 255]
[255 255 255 255 255 255]
[255 255 255 255 255 255]
[255 255 255 255 255 255]
[255 255 255 255 255 255]
[255 255 255 255 255 255]]
I experimented with nodata in the config and open but didn't seem to make any difference.
In [68]: print(gw.__version__)
1.7.3
I think this is because nodata doesn't get passed to warp_open in api.py, about line 491.
I'll prepare a pull request.
Actually I don't think I can do a pull request without forking first. Perhaps someone could do this for me.
See the lines https://github.com/jgrss/geowombat/blob/e691102a3dcce13b272810b43bc9586681ae6934/geowombat/core/api.py#L491-L497
I think the difference is simply:
❯ git diff --ignore-space-at-eol HEAD^^ HEAD geowombat/core/api.py
diff --git a/geowombat/core/api.py b/geowombat/core/api.py
index 77672e46..03de56b8 100755
--- a/geowombat/core/api.py
+++ b/geowombat/core/api.py
@@ -493,6 +493,7 @@ class open(object):
resampling=resampling,
dtype=dtype,
netcdf_vars=netcdf_vars,
+ nodata=nodata,
**kwargs)
else:
Thanks @rdenham, I'll take a look at this issue.
@rdenham See this PR.
Part of the issue was, as you pointed out, 'nodata' was not being passed to the warper. The other issue was that 'nodata' was not being retrieved from the global config.
This example:
with gw.config.update():
with gw.open("test.tif") as src:
data = src.data.compute()
print(data[0,0:10,0:10])
works because it passes the file name to Xarray and it pulls the nodata value directly.
For this:
bounds = BoundingBox(left=rasterOrigin[0], bottom=rasterOrigin[1] - 60, right=rasterOrigin[0] + 60, top=rasterOrigin[1])
with gw.config.update(ref_bounds=bounds):
with gw.open("test.tif") as src:
data = src.data.compute()
print(data[0,0:10, 0:10])
to work, you will have to pass the 'nodata' value explicitly like:
bounds = BoundingBox(left=rasterOrigin[0], bottom=rasterOrigin[1] - 60, right=rasterOrigin[0] + 60, top=rasterOrigin[1])
with gw.config.update(ref_bounds=bounds, nodata=255):
with gw.open("test.tif") as src:
data = src.data.compute()
print(data[0,0:10, 0:10])
or
bounds = BoundingBox(left=rasterOrigin[0], bottom=rasterOrigin[1] - 60, right=rasterOrigin[0] + 60, top=rasterOrigin[1])
with gw.config.update(ref_bounds=bounds):
with gw.open("test.tif", nodata=255) as src:
data = src.data.compute()
print(data[0,0:10, 0:10])
Thanks @jgrss
I think I misunderstood the meaning of nodata in gw.open (and gw.config.update). I realise now that it should be interpreted as a replacement value; a value which will replace all nodata values in your dataset; something like
nodata (Optional[float | int]): A 'no data' value to set. All nodata values in the image will be replaced with this value. Default is 0.
If that is the correct interpretation, shouldn't the following
with gw.open("test.tif", nodata=0) as src:
data = src.data.compute()
print(data[0,0:10, 0:10])
print out an array of 0's instead of 255's?
You're right regarding the change in description. I will also update it to explain that it only works when warping an image. When you open an image with no modifications then the no data value is taken from the file and not modified.
You're right regarding the change in description. I will also update it to explain that it only works when warping an image. When you open an image with no modifications then the no data value is taken from the file and not modified.
Maybe it should be removed from open()and only be used in the config. I'll do some cleaning.
Maybe it should be removed from open()and only be used in the config. I'll do some cleaning.
Just to complicate this, I think nodata will only take effect if you also pass in a ref_bounds object, since without this you won't be using warping.
I think I agree, use it only in config, and add documentation that if you aren't warping by eg passing bounds, then it will be ignored.
Closing this issue following #204