Unable to use dataarray if its a variable of a dataset
Description of the problem
If a dataarray has been converted to a DataSet with xarray.DataArray.to_dataset(), calling the dataarray in PyGMT modules fails. See below that output of grid and dataset.height are equivalent.
Full code that generated the error
import pygmt
import xarray as xr
region = [-28, -10, 62, 68]
grid = pygmt.datasets.load_earth_relief(resolution="03m", region=region)
dataset = grid.to_dataset(name='height')
print(pygmt.grdinfo(grid))
print(pygmt.grdinfo(dataset.height))
Full error message
grdcut [NOTICE]: Remote data courtesy of GMT data server oceania [http://oceania.generic-mapping-tools.org]
grdcut [NOTICE]: SRTM15 Earth Relief at 20x20 arc minutes reduced by Gaussian Cartesian filtering (37.1 km fullwidth) [Tozer et al., 2019].
grdcut [NOTICE]: -> Download grid file [831K]: earth_relief_20m_p.grd
: Title:
: Command:
: Remark:
: Pixel node registration used [Geographic grid]
: Grid file format: nf = GMT netCDF format (32-bit float), CF-1.7
: x_min: -28 x_max: -10 x_inc: 0.333333333333 (20 min) name: x n_columns: 54
: y_min: 62 y_max: 68 y_inc: 0.333333333333 (20 min) name: y n_rows: 18
: v_min: -2240 v_max: 1664 name: z
: scale_factor: 1 add_offset: 0
: format: classic
: Default CPT:
grdinfo [ERROR]: File /tmp/pygmt-yyhpy7cb.nc was not found
grdinfo [ERROR]: Cannot find file /tmp/pygmt-yyhpy7cb.nc
grdinfo [ERROR]: Must specify one or more input files
Output exceeds the size limit. Open the full output data in a text editor
---------------------------------------------------------------------------
GMTCLibError Traceback (most recent call last)
Untitled-1.ipynb Cell 1' in <cell line: 11>()
8 dataset = grid.to_dataset(name='height')
10 print(pygmt.grdinfo(grid))
---> 11 print(pygmt.grdinfo(dataset.height))
File /home/tankerma/miniconda/envs/grav_inv2/lib/python3.10/site-packages/pygmt/helpers/decorators.py:585, in use_alias.<locals>.alias_decorator.<locals>.new_module(*args, **kwargs)
580 msg = (
581 f"Short-form parameter ({short_param}) is not recommended. "
582 f"Use long-form parameter '{long_alias}' instead."
583 )
584 warnings.warn(msg, category=SyntaxWarning, stacklevel=2)
--> 585 return module_func(*args, **kwargs)
File /home/tankerma/miniconda/envs/grav_inv2/lib/python3.10/site-packages/pygmt/helpers/decorators.py:725, in kwargs_to_strings.<locals>.converter.<locals>.new_module(*args, **kwargs)
723 kwargs[arg] = separators[fmt].join(f"{item}" for item in value)
724 # Execute the original function and return its output
--> 725 return module_func(*args, **kwargs)
File /home/tankerma/miniconda/envs/grav_inv2/lib/python3.10/site-packages/pygmt/src/grdinfo.py:116, in grdinfo(grid, **kwargs)
114 with Session() as lib:
115 file_context = lib.virtualfile_from_data(check_kind="raster", data=grid)
--> 116 with file_context as infile:
117 lib.call_module(
...
GMTCLibError: Module 'grdinfo' failed with status code 72:
grdinfo [ERROR]: File /tmp/pygmt-yyhpy7cb.nc was not found
grdinfo [ERROR]: Cannot find file /tmp/pygmt-yyhpy7cb.nc
grdinfo [ERROR]: Must specify one or more input files
Output of running grid
xarray.DataArray'elevation'lat: 18lon: 54
-1.462e+03 -1.444e+03 -1.423e+03 ... -1.864e+03 -1.887e+03 -1.675e+03
Coordinates:
lon
(lon)
float64
-27.83 -27.5 ... -10.5 -10.17
lat
(lat)
float64
62.17 62.5 62.83 ... 67.5 67.83
Attributes:
long_name :
elevation relative to the geoid
cpt :
geo
units :
meters
vertical_datum :
EMG96
horizontal_datum :
WGS84
Output of running dataset.height
xarray.DataArray'height'lat: 18lon: 54
-1.462e+03 -1.444e+03 -1.423e+03 ... -1.864e+03 -1.887e+03 -1.675e+03
Coordinates:
lon
(lon)
float64
-27.83 -27.5 ... -10.5 -10.17
lat
(lat)
float64
62.17 62.5 62.83 ... 67.5 67.83
Attributes:
long_name :
elevation relative to the geoid
cpt :
geo
units :
meters
vertical_datum :
EMG96
horizontal_datum :
WGS84
System information
Please paste the output of python -c "import pygmt; pygmt.show_versions()":
PyGMT information:
version: v0.6.1
System information:
python: 3.10.5 | packaged by conda-forge | (main, Jun 14 2022, 07:04:59) [GCC 10.3.0]
executable: /home/tankerma/miniconda/envs/grav_inv2/bin/python
machine: Linux-5.4.0-96-generic-x86_64-with-glibc2.31
Dependency information:
numpy: 1.21.6
pandas: 1.4.2
xarray: 2022.3.0
netCDF4: 1.5.8
packaging: 21.3
geopandas: 0.10.2
ghostscript: 9.54.0
gmt: 6.4.0
GMT library information:
binary dir: /home/tankerma/miniconda/envs/grav_inv2/bin
cores: 112
grid layout: rows
library path: /home/tankerma/miniconda/envs/grav_inv2/lib/libgmt.so
padding: 2
plugin dir: /home/tankerma/miniconda/envs/grav_inv2/lib/gmt/plugins
share dir: /home/tankerma/miniconda/envs/grav_inv2/share/gmt
version: 6.4.0
I can confirm the issue but need some time to see why it doesn't work.
Okay great, thanks for looking into it :smile:
The issue is related to https://github.com/GenericMappingTools/pygmt/issues/524.
Here is another example to reproduce the issue after PR https://github.com/GenericMappingTools/pygmt/pull/2009 is merged:
from pygmt.datasets import load_earth_relief
grid = load_earth_relief(
resolution="01d", region=[0, 5, -5, 5], registration="pixel"
)
# check the original grid
assert len(grid.encoding["source"]) > 0
assert grid.gmt.registration == 1
assert grid.gmt.gtype == 1
# generate a new dataset
dataset = grid.to_dataset(name="height")
# source file is given but not found
assert len(dataset.height.encoding["source"]) > 0
assert not Path(dataset.height.encoding["source"]).exists()
# fallback to default registration and gtype
assert dataset.height.gmt.registration == 0
assert dataset.height.gmt.gtype == 0
# manually set the registration and gtype
dataset.height.gmt.registration = 1
dataset.height.gmt.gtype = 1
# the registration and gtype still have default values.
# Quote from https://docs.xarray.dev/en/stable/internals/extending-xarray.html
# > New instances, like those created from arithmetic operations or when
# > accessing a DataArray from a Dataset (ex. ds[var_name]), will have
# > new accessors created.
assert dataset.height.gmt.registration == 0
assert dataset.height.gmt.gtype == 0
Some discussions at https://github.com/pydata/xarray/issues/3205 and https://github.com/pydata/xarray/issues/3268.
This issue should be fixed by https://github.com/GenericMappingTools/pygmt/pull/2009, but please note that the registration and gtype information will be lost after converting to Dataset. See https://github.com/GenericMappingTools/pygmt/issues/499 for context and https://github.com/GenericMappingTools/pygmt/pull/2375 for the known limitations.
Great, thanks! I look forward to using it in v0.9 :)