Problems feeding data to operational model: Target variable geopotential_at_surface must be time-dependent
Hi, I am trying to run the GraphCast operational model with my own data, and there seems to be a problem with the xarray object I build from operational data.
When I run a script that gets the input data from Google Cloud, it works just fine, and the data look like this:
(Pdb) eval_inputs
<xarray.Dataset>
Dimensions: (batch: 1, time: 2, lat: 721, lon: 1440,
level: 13)
Coordinates:
* lon (lon) float32 0.0 0.25 0.5 ... 359.5 359.8
* lat (lat) float32 -90.0 -89.75 ... 89.75 90.0
* level (level) int32 50 100 150 200 ... 850 925 1000
* time (time) timedelta64[ns] -1 days +18:00:00 00...
Dimensions without coordinates: batch
Data variables: (12/17)
2m_temperature (batch, time, lat, lon) float32 250.3 ... 2...
mean_sea_level_pressure (batch, time, lat, lon) float32 9.936e+04 ....
10m_v_component_of_wind (batch, time, lat, lon) float32 -0.4746 ......
10m_u_component_of_wind (batch, time, lat, lon) float32 -5.817 ... ...
temperature (batch, time, level, lat, lon) float32 238....
geopotential (batch, time, level, lat, lon) float32 1.98...
... ...
year_progress_sin (batch, time) float32 0.006986 0.01129
year_progress_cos (batch, time) float32 1.0 0.9999
day_progress_sin (batch, time, lon) float32 0.0 ... 1.0
day_progress_cos (batch, time, lon) float32 1.0 ... 0.004363
geopotential_at_surface (lat, lon) float32 2.735e+04 ... -0.07617
land_sea_mask (lat, lon) float32 1.0 1.0 1.0 ... 0.0 0.0 0.0
And when I build my own xarray object, it looks like this:
(Pdb) input_data
<xarray.Dataset>
Dimensions: (lat: 721, lon: 1440, time: 2, level: 13,
batch: 1)
Coordinates:
* lat (lat) float64 -90.0 -89.75 ... 89.75 90.0
* lon (lon) float64 -180.0 -179.8 ... 179.5 179.8
* time (time) timedelta64[ns] -1 days +18:00:00 00...
* level (level) float64 50.0 100.0 ... 925.0 1e+03
* batch (batch) int64 1
Data variables: (12/16)
temperature (batch, time, lat, lon, level) float32 239....
u_component_of_wind (batch, time, lat, lon, level) float32 1.65...
v_component_of_wind (batch, time, lat, lon, level) float32 -14....
geopotential (batch, time, lat, lon, level) float32 1.98...
specific_humidity (batch, time, lat, lon, level) float32 3.09...
10m_v_component_of_wind (batch, time, lat, lon) float32 -0.6771 ......
... ...
mean_sea_level_pressure (batch, time, lat, lon) float32 9.939e+04 ....
toa_incident_solar_radiation (batch, time, lat, lon) float64 554.8 ... 0.0
year_progress_sin (batch, time) float64 -0.008601 0.0
year_progress_cos (batch, time) float64 1.0 1.0
day_progress_sin (batch, time, lon) float64 -1.0 -1.0 ... 0.0
day_progress_cos (batch, time, lon) float64 -1.837e-16 ... 1.0
The problem is that when I run the model through rollout.chunked_prediction with the eval_inputs data it works just fine, but when I use my input_data I get the following error:
Traceback (most recent call last):
File "/home/eloy.anguiano/repos/graphcast/1.get_data.py", line 342, in <module>
predictions = rollout.chunked_prediction(
File "/home/eloy.anguiano/repos/graphcast/graphcast/rollout.py", line 68, in chunked_prediction
for prediction_chunk in chunked_prediction_generator(
File "/home/eloy.anguiano/repos/graphcast/graphcast/rollout.py", line 164, in chunked_prediction_generator
predictions = predictor_fn(
File "/home/eloy.anguiano/repos/graphcast/1.get_data.py", line 199, in <lambda>
return lambda **kw: fn(**kw)[0]
File "/home/eloy.anguiano/miniconda3/envs/graphcast_iic/lib/python3.10/site-packages/haiku/_src/transform.py", line 456, in apply_fn
out = f(*args, **kwargs)
File "/home/eloy.anguiano/repos/graphcast/1.get_data.py", line 165, in run_forward
return predictor(inputs, targets_template=targets_template, forcings=forcings)
File "/home/eloy.anguiano/repos/graphcast/graphcast/autoregressive.py", line 163, in __call__
self._validate_targets_and_forcings(targets_template, forcings)
File "/home/eloy.anguiano/repos/graphcast/graphcast/autoregressive.py", line 103, in _validate_targets_and_forcings
raise ValueError(f'Target variable {name} must be time-dependent.')
ValueError: Target variable geopotential_at_surface must be time-dependent.
This seems a bit strange, as geopotential_at_surface is not time-dependent in either dataset, so I would like to know whether something else in my data could be raising this error. Here is the problematic variable in both datasets:
Tutorial data
(Pdb) eval_inputs.geopotential_at_surface
<xarray.DataArray 'geopotential_at_surface' (lat: 721, lon: 1440)>
array([[ 2.7354750e+04, 2.7354750e+04, 2.7354750e+04, ...,
2.7354750e+04, 2.7354750e+04, 2.7354750e+04],
[ 2.7163490e+04, 2.7165285e+04, 2.7167082e+04, ...,
2.7159000e+04, 2.7159898e+04, 2.7161693e+04],
[ 2.6957861e+04, 2.6961453e+04, 2.6965045e+04, ...,
2.6949779e+04, 2.6952475e+04, 2.6956066e+04],
...,
[-1.8730469e+00, -1.8730469e+00, -1.8730469e+00, ...,
-1.8730469e+00, -1.8730469e+00, -1.8730469e+00],
[ 4.4121094e+00, 4.4121094e+00, 4.4121094e+00, ...,
4.4121094e+00, 4.4121094e+00, 4.4121094e+00],
[-7.6171875e-02, -7.6171875e-02, -7.6171875e-02, ...,
-7.6171875e-02, -7.6171875e-02, -7.6171875e-02]], dtype=float32)
Coordinates:
* lon (lon) float32 0.0 0.25 0.5 0.75 1.0 ... 359.0 359.2 359.5 359.8
* lat (lat) float32 -90.0 -89.75 -89.5 -89.25 ... 89.25 89.5 89.75 90.0
My data
(Pdb) input_data.geopotential_at_surface
<xarray.DataArray 'geopotential_at_surface' (lat: 721, lon: 1440)>
array([[ 2.7109883e+04, 2.7109883e+04, 2.7109883e+04, ...,
2.7109883e+04, 2.7109883e+04, 2.7109883e+04],
[ 2.7554883e+04, 2.7553883e+04, 2.7551883e+04, ...,
2.7561883e+04, 2.7559883e+04, 2.7556883e+04],
[ 2.8437883e+04, 2.8431883e+04, 2.8425883e+04, ...,
2.8454883e+04, 2.8448883e+04, 2.8442883e+04],
...,
[-5.1181641e+00, -5.1181641e+00, -5.1181641e+00, ...,
-4.1181641e+00, -4.1181641e+00, -4.1181641e+00],
[ 1.0881836e+01, 9.8818359e+00, 9.8818359e+00, ...,
1.0881836e+01, 1.0881836e+01, 1.0881836e+01],
[ 1.8818359e+00, 1.8818359e+00, 1.8818359e+00, ...,
1.8818359e+00, 1.8818359e+00, 1.8818359e+00]], dtype=float32)
Coordinates:
* lat (lat) float64 -90.0 -89.75 -89.5 -89.25 ... 89.25 89.5 89.75 90.0
* lon (lon) float64 -180.0 -179.8 -179.5 -179.2 ... 179.2 179.5 179.8
Could the longitude values be triggering an unhandled error? Does anyone have any tips on how to proceed?
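A hedged diagnostic sketch, not a confirmed fix: the message says "Target variable", and the traceback shows it being raised from `_validate_targets_and_forcings(targets_template, forcings)`, so the static field may be reaching the model through `targets_template` rather than through the inputs. The snippet below lists the variables that lack a `time` dimension. `static_variables` is a hypothetical helper, and the tiny dataset is a stand-in with made-up shapes, not real GraphCast data.

```python
import numpy as np
import xarray as xr


def static_variables(ds: xr.Dataset, time_dim: str = "time") -> list[str]:
    """Return the names of data variables that lack the time dimension."""
    return [name for name, var in ds.data_vars.items() if time_dim not in var.dims]


# Stand-in for a targets template (illustrative shapes only).
targets_template = xr.Dataset(
    {
        "2m_temperature": (
            ("batch", "time", "lat", "lon"),
            np.zeros((1, 1, 3, 4), dtype=np.float32),
        ),
        "geopotential_at_surface": (
            ("lat", "lon"),
            np.zeros((3, 4), dtype=np.float32),
        ),
    }
)

# Variables that a "must be time-dependent" check would reject.
offending = static_variables(targets_template)
print(offending)

# Dropping them leaves only time-dependent targets.
cleaned = targets_template.drop_vars(offending)
```

Running the same check on the real `targets_template` passed to `rollout.chunked_prediction` would show whether `geopotential_at_surface` ended up in it.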
It looks like your lon values are (-180, 180) instead of (0, 360). I'm not sure if that matters, but it certainly looks suspicious.
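If the coordinates do turn out to matter, a minimal sketch of bringing a dataset in line with the tutorial layout printed above might look like this. It wraps `lon` into [0, 360), casts `lat` and `lon` to float32 and `level` to int32, drops the `batch` coordinate values, and reorders dimensions to (batch, time, level, lat, lon). `match_tutorial_layout` is a hypothetical helper, and whether any of these differences actually triggers the error is an assumption, not something confirmed here.

```python
import numpy as np
import xarray as xr


def match_tutorial_layout(ds: xr.Dataset) -> xr.Dataset:
    """Align coordinates and dimension order with the tutorial dataset."""
    # Wrap longitudes from [-180, 180) into [0, 360), then sort ascending.
    wrapped = np.mod(ds["lon"].values, 360).astype(np.float32)
    ds = ds.assign_coords(lon=wrapped).sortby("lon")
    # Cast the remaining coordinates to the dtypes shown in the tutorial data.
    ds = ds.assign_coords(
        lat=ds["lat"].values.astype(np.float32),
        level=ds["level"].values.astype(np.int32),
    )
    # The tutorial dataset has batch as a dimension without coordinate values.
    if "batch" in ds.coords:
        ds = ds.drop_vars("batch")
    # Reorder dimensions; each variable keeps only the dims it actually has.
    return ds.transpose("batch", "time", "level", "lat", "lon")


# Small illustrative dataset whose values equal the original longitude.
lon = np.array([-180.0, -90.0, 0.0, 90.0])
data = np.broadcast_to(
    lon[None, None, None, :, None], (1, 2, 3, 4, 2)
).astype(np.float32)
demo = xr.Dataset(
    {"temperature": (("batch", "time", "lat", "lon", "level"), data)},
    coords={
        "lat": [-90.0, 0.0, 90.0],
        "lon": lon,
        "level": [50.0, 100.0],
        "batch": [1],
    },
)
out = match_tutorial_layout(demo)
print(out["lon"].values, out["temperature"].dims)
```

The data move together with the longitudes when sorting, so the column that was at -180 ends up at 180 rather than being relabelled in place.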