Generalize vorticity (and related functions) to allow for map projections
Some MetPy functions, such as gradient and advection, allow the user to supply arrays of deltas rather than a single value, but vorticity (and its relatives) does not. Vorticity is more complicated since derivatives of the "map factors" are also required, but it should be doable.
Here is the necessary computation according to the GEMPAK documentation (Appendix B2):
Since the map factors (m_x and m_y) are equivalent to the "nominal" grid spacing (\partial x and \partial y in the above equation) divided by the actual delta, an additional "nominal_delta" argument to vorticity would allow for the map factors to be computed. Since the WRF model provides the map factors in its output, allowing the map factors (and the nominal grid spacing) as optional arguments to vorticity in lieu of deltas would be useful as well.
So while the docs for vorticity currently say that dx/dy have to be float (oops, that needs fixing), with the new finite differencing in 0.7, we can now handle vorticity for arbitrarily-spaced points. If you've tried to pass an array in for dx/dy, and it failed, that's a bug.
As far as the mapping/projection issues are concerned, that's still an open issue (covered in #174). So I'm a little unclear--is the map factor there to handle irregularly-spaced points or to handle issues from projection? or both?
OK, I see that vorticity does compute something when I provide arrays for dx/dy.
Regarding the map factors, they are fundamentally there to handle issues from projection. One of those issues is that the data array is irregularly spaced, but another issue is that computing vorticity (more generally, the curl of a vector) in a quantitatively correct manner requires the map factors.
The current implementation of vorticity is correct if the underlying data is truly in Cartesian coordinates (even if unequally spaced). I suppose output from a cloud model would fit this description.
However, when the data is not truly Cartesian (e.g., NWP output on some map projection), the third and fourth terms above are necessary for a quantitatively correct calculation. In many (most?) instances, the third and fourth terms will be quite small, certainly not noticeable qualitatively (say, when looking at contours), but one case where the difference is noticeable is when vorticity is computed from data on a lat/lon grid and examined in polar regions (where, due to convergence of the meridians, the dmx/dy term gets large).
For instance, compare the GEMPAK and MetPy-derived plots below (Python code is at https://gist.github.com/sgdecker/496f7ea7edd98b428ac3adab0b5e0842; GEMPAK is at https://gist.github.com/sgdecker/1a0bbf2e18dd0a2f9b4b02c83e07509b):
Look closely and you'll see, for instance, that the 4 contour to the left (south, technically) of the vort max in the Arctic Ocean differs noticeably.
A mathematical way to see this is to note that vorticity in spherical coordinates is given by (with theta as colatitude):
which expands to:
The final term (which corresponds to the dmx/dy term in GEMPAK's general expression) blows up at the pole, but if we are 1 degree away, the cotangent is order 10^2, so using U ~ 10 m/s, and r ~ 10^7 m, the term is on the order of 10^-4 s^(-1).
The divergence and laplacian operators will have the same issue.
Thanks for the detailed analysis. It's what I figured, and definitely on our roadmap. Due to the nicely detailed analysis, let's leave this open for addressing the map factors, and we can close out #174.
(Putting the link to the NOAA writeup about map projections here: https://www.arl.noaa.gov/hysplit/cmapf-mapping-routines/)
Nice resource in the ARL guide. I do want to suggest caution in retaining 3D coordinates as much as possible, as I advocated in the Cartopy PlateCaree saga - while the ARL guide has nice routines for 2D, and that's fine for maps, it can make forward/inverse and mixed-coordinate plotting difficult.
Anyway, proj.4 has a 3D-aware forward / inverse, and a quick search turns up partial derivative support through the proj_factors function - maybe that's useful in ensuring we account for what @sgdecker points out above. This issue on OSGeo also seems related.
Since I mentioned divergence and the Laplacian operator, I went ahead and made plots of those as well to show the difference between GEMPAK and MetPy.
Here is 500-mb divergence:
Here is the Laplacian of 500-mb height:

Perusing the GEMPAK documentation, I found other calculations that also require taking derivatives of map factors to get absolutely correct results:
- Horizontal partial derivatives of vectors (but not scalars), which in turn affects:
- Q-vectors
- Deformation (total, stretching, and shearing), which in turn affects:
- Frontogenesis (also depends on divergence)
- Inertial advection, which in tern affects:
- Rossby number
- Divergence-dependent calculations, such as:
- Flux divergence
- Layer-average mass divergence
- Vorticity-dependent calculations, such as:
- Absolute vorticity
- Potential vorticity
- Poisson equation solver
Okay, so I've been looking into this and have the following to report on absolute vorticity...
Here is the difference between GEMPAK and MetPy v0.10; note the very small range of color values

The average difference is 0.0051 /s. If the following term is added + (uwnd / (6375000 * units.meter)) * np.tan(np.deg2rad(lats[:, None])) the difference is reduced by an order of magnitude to essentially zero.

The average difference is -0.00042. Note: The single strip of color at the poles is because MetPy's derivative does not assess vorticity at the top and bottom of the array. Excluding those rows does not appreciably affect the mean difference calculations.
The additive factor depends on the u-component of the wind and latitude and it is not a universal or 3D solution to the issue as it will has the potential to be slightly different for some of the calculations (e.g., laplacian). Since the list is "relatively" small, this can be done for the various cases and added, however with our current implementation it would require the addition of potentially two arguments. One argument signaling to use (or not use) the spherical calculation and another that contains the latitude array. This would not be a problem for absolute_vorticity since it already requires the latitudes for the calculation of coriolis_parameter, therefore we would only need to add the spherical designation. I would additionally promote the default use of spherical to beTrue since that would be the much large use case for our current user base.
I haven't tried implementing this yet, but I think the general solution would be to accept the data CRS object as an optional argument, rather than more specific things such as latitude. If that argument is not provided, assume the grid is Cartesian, and set the map factors to one. Otherwise, use the CRS object to obtain the map factors via proj4 (as discussed by @deeplycloudy). In either case, use the formula in my original comment as the basis for the computation. The part that is unclear to me without actually trying this yet is how difficult obtaining the map factors from the CRS is.
This question is more related to the information needed for projection-aware deriviative-based calculations, rather than the implementation, but I still wanted to ask, since it may end up being useful for the implementation.
I've recently been planning out a collection of "dataset helpers" to flesh out any missing CF metadata in xarray Datasets, so that all of the MetPy-xarray integration functionality (projections, coordinates, units, and hopefully soon variable identification and the field solver) works as smoothly as possible. WRF output has been the motivating example (see https://github.com/Unidata/MetPy/issues/1004).
With that in mind, what would be the best set of information to standardize having in a DataArray and/or Dataset to ensure that these projection-aware calculations can be carried out? Is it sufficient to have
- projection x and y coordinates with a
crs? - latitude and longitude dimension coordinates (if plate carrée/equirectangular) or auxiliary coordinates (if not)?
or am I missing something else?
The goal, I would think, would be to be able to calculate something like vorticity with a call such as
vorticity = mpcalc.vorticity(data['u_wind'], data['v_wind'])
and have everything else be obtained automatically from the DataArrays. This, if combined with some way of annotating the function and its arguments, would also really set the stage for the field solver.
Correct, my ideal is to have the CRS information, as well as coordinates like x or lon, attached to the data arrays. If we can't get map factors from the CRS...that would be bad.
While I had hoped that #1260 (which allows for easy calculation of both nominal and actual grid deltas from a parsed DataArray) would be enough to make this feasible for quick implementation for v1.0, #1275 raised the point about grid-relative vs. earth-relative winds that I had not fully considered with respect to this issue. So,
- What is the correct behavior for these kinematic calculations with respect to earth-relative vs. grid-relative winds?
- When working with geostrophic wind, how do we know when the result should be grid-relative (like it is now) or earth-relative (like would be needed for ageostrophic wind when the observed wind is earth-relative)?
- Do we have any test data with expected results for these calculations that we can rely on here?
With all this extra complexity, though, I unfortunately don't see a way that this can feasibly be resolved with a stable API for v1.0 in the immediate future. So, will this issue need to be punted from v1.0, and instead, for now, just use a simple xarray input/output wrapper that fills in the default grid spacing (dx and dy) and coordinate-dependent parameters (f and latitude), and calculates these kinematic functions as they are now?
The equation way back in the first comment is valid for grid-relative u and v. If you used earth-relative winds on, say, a polar stereographic projection where North America is right-side up but Asia is upside down (so that grid-relative and earth-relative winds are nearly equal and opposite), I'd suspect adding the map factor terms would actually degrade the computation.
@jthielen I agree about punting from 1.0. I'll do some milestone work later, but it's pretty clear we're not going to get the 1.0 we want, but the one we deserve. :wink: We can talk more when you get into AMS. Let me know.
@jthielen I agree about punting from 1.0. I'll do some milestone work later, but it's pretty clear we're not going to get the 1.0 we want, but the one we deserve. We can talk more when you get into AMS. Let me know.
Sounds good. I'll be getting in tomorrow afternoon. We can chat via email if you want to find a more specific time, otherwise I'll plan on stopping by the Unidata table at the career fair that evening.
This notebook from @deeplycloudy may help shed some light on using PyProj for some complicated reprojections.
@sgdecker would you be available to join our biweekly collaborators call (#1684) for some live exploration of potential solutions we have been discussing lately surrounding this issue?
@kgoebber shared this notebook with us. We'll be using these two notebooks as a starting point for this discussion at our next dev call!
@dcamron Yes, I can join in on July 8.
For reference in advance of our discussion, I worked a bit more with @kgoebber's notebook and added a bunch of notes while trying to reconcile the two approaches to calculating curvilinear gradients. I wasn't entirely successful but hopefully my prose can help someone spot the way forward. I've forked and updated the notebook uploaded by @dcamron.
Using GEMPAK's value for the radius of the Earth marginally improves (by 0.02% or so, consistent with the radius differing in the fifth significant digit) the correspondence between MetPy and GEMPAK values in method one. I though switching to 32-bit arithmetic might further increase the match, but it does not. I am guessing the remaining difference comes from the loss of precision that occurred when generating the .out files.
Calculating the map factors as
m_x = 1 + (factors.parallel_scale - 1) / mpconst.earth_avg_radius.m
m_y = 1 + (factors.meridional_scale - 1) / mpconst.earth_avg_radius.m
brings things closer to GEMPAK in method two (RMS reduced from 0.03303 to 0.00108), but not nearly as close as method one (0.00002). This was a hunch on my part, but not having a good feel for what the pyproj meridional_scale and parallel_scale actually are yet, this transformation could be off base.
So based on today's dev call, here's the plan of action:
- First implement the additional terms in gradient, divergence, vorticity (vertical component of curl), and scalar laplacian to account for operating on a spheroid--target 1.1 for this functionality
- This will put the corrections in the functions
- Add optional function arguments for latitude and CRS (to give us radius--defaults to our constant for radius) on those functions (Use the presence of latitude to trigger the correction)
- Add handling for CRS to
add_grid_arguments_from_xarray - This allows us to get accurate results by going from projected coordinates to lat/lon and doing that calculations in lat/lon space
- In the future use PROJ's map factors to implement the calculations directly in projected coordinates more efficiently (provided we get everything right using factors)
- This might also allow us to properly work with grid-relative winds (#1047)--if the corrections are necessary with grid-relative winds (depends on whether the grid is Cartesian or not?)
(cc @jthielen @deeplycloudy @kgoebber )
Okay, so I've created an initial PR, please feel free to take it from there.
As I was working on this I delved into some NAM data and the vorticity calculation. Interestingly, the calculation matches what GEMPAK produces when no correction is made. The NAM data that I am using to test has grid-relative winds and does need to be converted to plot correctly, however, if you do the conversion for the calculation, it degrades the calculation. I haven't been able to unlock any combination of correction that improves over the base calculation that is already in MetPy. I haven't had time yet to create a refined notebook - I'll do that in the next day or two and update with the NAM example for others to peruse.
Here is a cleaned up notebook that I created to demonstrate what I have tried: https://gist.github.com/kgoebber/a5482e8e8c51396e80ffe3293696f57a
Makes sense that the spherical correction is no good for NAM 218, as that isn't a spherical grid. The fact that the last approach based on the map factors doesn't work so well is the mystery. I will be investigating further, but there is a nice big picture overview at https://gis.stackexchange.com/questions/122703/influence-of-the-scale-factor-on-the-projection (first answer) that I think has important info, but it is a matter of translating it into proj4 language to determine what is missing. My current hunch is that something involving the "nominal scale" needs to be accounted for.
My investigations are cataloged here: https://github.com/sgdecker/testing/blob/188ed819b4b464d5a3c1b83034435f0e6b9d800c/MapProjectionExploration.ipynb
I can now reproduce the GEMPAK calculation on Grid 218 using the proj4 map factors. I am not sure what I am doing differently from @kgoebber actually, but it works! Details are in this notebook, and the data files are in the repository.
Summary notebook demonstrating a working vorticity function for Grid 218 is here. Still need to revisit GFS global data and look at a stereographic grid.
Just a side question, since such good progress is being made on the implementation side: for my own understanding's sake (and probably also documentation in the eventual implementation), is there a good reference for how these map-factor-based formulations come about? The GEMPAK appendix's "left as an exercise to the reader" is not very satisfactory, especially since it requires a working knowledge of differential geometry.
It is at heart differential geometry, but you can see the general expression in the table here (good luck going through the derivations to get to that table!), where h1 is mx, h2 is my, h3 is 1 (no transformation in the vertical), v1 is u, v2 is v, and we only care about the vertical component (i = 3). The expression for vorticity that the GEMPAK documentation shows should emerge from the expression in the table corresponding to the curl of a vector field with the aforementioned substitutions.
Edit: Actually, we have orthogonal coordinates, so the info above, while correct, is more general than it needs to be. This page is better.
Thanks for this additional investigation @sgdecker! I was able to match your calculation by using just the parallel_scale and meridional_scale for the map factors. I was originally using the modification that you had explored in a previous notebook.