Decoding per second units
What went wrong?
When using some old data available through NCEI, the units for Absolute_vorticity_isobaric are stored as /s, which is not able to be correctly parsed by Pint to do any of our operations that require units. The error comes out and is documented in the traceback below. I think we'll be able to do something within the units.py to capture this to help our users, similar to dealing with the percent symbol.
Operating System
MacOS
Version
1.2
Python Version
3.9
Code to Reproduce
# Remote Access to NCEI
ds = xr.open_dataset('https://www.ncei.noaa.gov/thredds/dodsC/model-gfs-g3-anl-files-old/'
'200612/20061201/gfsanl_3_20061201_1200_000.grb')
# Subset data to be just over the U.S. for plotting purposes
ds = ds.sel(lat=slice(70,10), lon=slice(360-150, 360-55)).metpy.parse_cf()
# Isolate absolute vorticity and quantify the object
avor = ds.Absolute_vorticity_isobaric.metpy.quantify()
Errors, Traceback, and Logs
Traceback (most recent call last):
File ~/miniconda3/envs/main/lib/python3.8/site-packages/IPython/core/interactiveshell.py:3251 in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
Input In [5] in <module>
avor = ds.Absolute_vorticity_isobaric.metpy.quantify()
File ~/miniconda3/envs/main/lib/python3.8/site-packages/metpy/xarray.py:211 in quantify
quantified_dataarray = self._data_array.copy(data=self.unit_array)
File ~/miniconda3/envs/main/lib/python3.8/site-packages/metpy/xarray.py:157 in unit_array
return units.Quantity(self._data_array.data, self.units)
File ~/miniconda3/envs/main/lib/python3.8/site-packages/metpy/xarray.py:134 in units
return units.parse_units(self._data_array.attrs.get('units', 'dimensionless'))
File ~/miniconda3/envs/main/lib/python3.8/site-packages/pint/registry.py:1161 in parse_units
units = self._parse_units(input_string, as_delta, case_sensitive)
File ~/miniconda3/envs/main/lib/python3.8/site-packages/pint/registry.py:1391 in _parse_units
return super()._parse_units(input_string, as_delta, case_sensitive)
File ~/miniconda3/envs/main/lib/python3.8/site-packages/pint/registry.py:1182 in _parse_units
units = ParserHelper.from_string(input_string, self.non_int_type)
File ~/miniconda3/envs/main/lib/python3.8/site-packages/pint/util.py:623 in from_string
ret = build_eval_tree(gen).evaluate(
File ~/miniconda3/envs/main/lib/python3.8/site-packages/pint/pint_eval.py:114 in evaluate
raise DefinitionSyntaxError('missing unary operator "%s"' % op_text)
File <string>
DefinitionSyntaxError: missing unary operator "/"
The unit string '/s' isn't valid CF/UDUNITS-2, so should this be considered an upstream issue with THREDDS or NCEI's configuration? Or is this another case like 'gpm' where we should adapt to non-conforming units in common use?
@jthielen good question. After a bit more investigation, it appears to be in issue with the grib1 format used (on at least the GFS 1 degree analysis files), the files stored using grib2 (i.e., .grb2) all appear to be CF-compliant. I'm unsure of what level of work would be necessary to adapt the files upstream. I don't have confidence that there would be a timely way to make the modification upstream and think it might be best for us to treat it like 'gpm' to ensure a better user experience.
Is the workaround here to do:
ds.Absolute_vorticity_isobaric.attrs['units'] = '1/s'
?
Given this is the first time we've hit that, I'm not sure how prevalent the problem is. Right now, the balance of frequency vs. difficulty of working around slants to me in favor of not putting this in MetPy.
Yes, that is a work around, but the error that comes out would in no way lead to understanding that this is even a units issue...took me a while to get to that is the error. These are the types of issues that will crop up from time to time as a result of not having full control over the data being used. I'm okay with not putting something into MetPy to cover this, but then how can we better warn the end user to allow them to more easily get to the needed work around of manually changing the units.
Part of the issue is that this unit error will prevent even doing smoothing as the move to quantify the object occurs when you are using an xarray Dataarray object. So even when there is not calculation being performed, end users could find themselves flummoxed.
🤮
From a UX point-of-view, given that xarray data structures are most likely where such "unclean" metadata would show up (compared to Pint quantities that users create themselves), would it make sense to intercept Pint errors in the xarray unit handling, and issue a more high-level error like "unit string '/s' on 'Absolute_vorticity_isobaric' cannot be parsed"?
That's kind of the solution I was bouncing around in my head.