parse_cf generates "Could not find variable" warnings
What went wrong?
Using .metpy.parse_cf() generates:
Could not find variable corresponding to the value of grid_mapping: LatLon_Projection
multiple times.
Operating System
Linux
Version
1.2.0
Python Version
3.9.9
Code to Reproduce
import xarray as xr
from metpy.cbook import get_test_data
data = xr.open_dataset(get_test_data('GFS_test.nc', False)).metpy.parse_cf()
Errors, Traceback, and Logs
Could not find variable corresponding to the value of grid_mapping: LatLon_Projection
Could not find variable corresponding to the value of grid_mapping: LatLon_Projection
Could not find variable corresponding to the value of grid_mapping: LatLon_Projection
Could not find variable corresponding to the value of grid_mapping: LatLon_Projection
Could not find variable corresponding to the value of grid_mapping: LatLon_Projection
Could not find variable corresponding to the value of grid_mapping: LatLon_Projection
Could not find variable corresponding to the value of grid_mapping: LatLon_Projection
Could not find variable corresponding to the value of grid_mapping: LatLon_Projection
Could not find variable corresponding to the value of grid_mapping: LatLon_Projection
Well, it's not a bug in MetPy. The problem is actually this data file, which came from THREDDS a while ago: it declares that a variable "LatLon_Projection" should be used for the grid mapping information, but that variable is not found anywhere in the file. So strictly speaking, the warning is due to the fact that the file isn't CF-compliant.
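For anyone wanting to confirm this on their own file, here is a minimal sketch of the check. It assumes the simple form of grid_mapping, where the attribute holds a single variable name. The helper name missing_grid_mappings and the example variable names are hypothetical, not part of MetPy or the test file.

```python
def missing_grid_mappings(var_attrs):
    """Return {variable name: grid_mapping value} for dangling references.

    var_attrs maps each variable name in a file to its attribute dict.
    A reference is dangling when the grid_mapping attribute names a
    variable that does not exist in the file.
    """
    missing = {}
    for name, attrs in var_attrs.items():
        target = attrs.get('grid_mapping')
        if target is not None and target not in var_attrs:
            missing[name] = target
    return missing


# Hypothetical stand-in for the variables of a file like the one above
example = {
    'Temperature_isobaric': {'units': 'K', 'grid_mapping': 'LatLon_Projection'},
    'u-component_of_wind_isobaric': {'units': 'm/s', 'grid_mapping': 'LatLon_Projection'},
    'lat': {'units': 'degrees_north'},
}
print(missing_grid_mappings(example))
```

With an xarray Dataset ds, the input could be built as {name: var.attrs for name, var in ds.variables.items()}. Each data variable that carries the dangling reference shows up separately, which matches the one-warning-per-variable output in the report.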
The fix would be to fix the file, though I have concerns as to whether that will bloat the repository. Is there some problem this is causing for you?
The only real problem is it pollutes the tutorial output here: https://unidata.github.io/MetPy/latest/tutorials/declarative_tutorial.html
But perhaps there is a different test file that tutorial could rely on, to avoid the warnings.
This kind of issue (repeated warnings) makes me question the current design of parse_cf(). Right now, calling parse_cf with varname left at its default of None is treated as "apply parse_cf separately to each data variable" rather than "parse this dataset as a whole." If that were clearly communicated to and understood by users, repeated warnings like this would be reasonably expected. However, based on practical usage, parse_cf() on its own really does seem to imply "parse this dataset as a whole," in which case just one warning should be issued. Should we rework the internals of parse_cf to prioritize such a "whole dataset" perspective?
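In the meantime, a user-side workaround might be to drop repeated messages before they are displayed. The sketch below is not the proposed rework of parse_cf, only an illustration using the standard library's logging filters. It assumes the warning is emitted through Python's logging module; the class name OncePerMessage and the logger name parse_cf_demo are made up for the demonstration.

```python
import logging


class OncePerMessage(logging.Filter):
    """Let each distinct log message through only the first time it appears."""

    def __init__(self):
        super().__init__()
        self._seen = set()

    def filter(self, record):
        message = record.getMessage()
        if message in self._seen:
            return False  # already reported once, drop the repeat
        self._seen.add(message)
        return True


class ListHandler(logging.Handler):
    """Collect formatted messages in a list so the result can be inspected."""

    def __init__(self, sink):
        super().__init__()
        self.sink = sink

    def emit(self, record):
        self.sink.append(record.getMessage())


# Demonstration on a stand-in logger (hypothetical name, not MetPy's)
captured = []
demo_log = logging.getLogger('parse_cf_demo')
demo_log.propagate = False
demo_log.addHandler(ListHandler(captured))
demo_log.addFilter(OncePerMessage())

# Emit the same warning nine times, as in the report above
for _ in range(9):
    demo_log.warning('Could not find variable corresponding to the value of '
                     'grid_mapping: %s', 'LatLon_Projection')

print(len(captured))
```

Note that a filter attached to a logger only applies to records emitted on that exact logger, not on its children, so in practice it would need to go on whichever logger actually issues the message, or on a handler instead.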
👍 to @jthielen's suggestion. Looking at the file in question, it's only 2.4M, so maybe fixing it isn't so awful either. (A fresh checkout of the MetPy repo is currently 1.1G 😱 )
It appears that 601 MB is in .git/objects/pack
#2531 adds the variable to the file to silence the warning. I'm leaving this open to deal with this behavior at the API level, but bumping it from our milestone.