Modification to integrated test tolerance scheme.
Is your feature request related to a problem? Please describe. Currently, the tolerance scheme is as follows:
Entries x1 and x2 are considered equal iff
|x1 - x2| <= ATOL or |x1 - x2| <= RTOL * |x2|.
To measure the degree of difference a scaling factor q is introduced. The goal is now to minimize q such that
|x1 - x2| <= ATOL * q or |x1 - x2| <= RTOL * |x2| * q.
If RTOL * |x2| > ATOL
q = |x1 - x2| / (RTOL * |x2|)
else
q = |x1 - x2| / ATOL.
If the maximum value of q over all the entries is greater than 1.0 then the arrays are considered different and an error message is produced.
This is insufficient to catch diffs across all data that will have different scales. For instance, stress typically has a magnitude on order of 1e6, while displacements have magnitude on order of 1e-3. So the ATOL isn't really applicable for both scales. For example using a RTOL=1e-6, ATOL=1e-6:
For a stress field with a maximum value of 1e6, we can have an entry with x1=1.001, x2=1.0. This will fail RTOL,and fails ATOL. However this is essentially equal to the 9th digit when considering that the maximum value of the field is 1e6. To pass we would need an ATOL>1.0e-3.
In contrast a displacement field with a maximum value of 10e-3 and x1=1.1e-7, x2=1.0e-7. This fails RTOL, but passes ATOL....so these would be considered equal even though there is a pretty large diff when compared to the maximum value of 10e-3.
Describe the solution you'd like
|x1 - x2| <= ATOL * (1 + max(|x2|) )
or
|x1 - x2| <= RTOL * |x2|
where max(\x2\) is the maximum value of all x2 in the problem. This effectively normalizes |x1 - x2| s.t. it may be compared with unity, and thus ATOL will be relative to 1....which essentially is a measure of the number of significant digits we expect the values to have.
With this approach, the previous example would pass for stress, and fail for displacement...which is more appropriate.
Describe alternatives you've considered Using a really low tolerance.
Additional context Please suggest any other alternatives.
Another alternative is what silodiff does...which is pretty well proven:
-x EPS, --epsilon=EPS
= When non-negative, all relative differencing epsilon
= parameters are set to EPS and an alternate relative
= difference scheme is used where two numbers, A and B, are
= different if
= |A-B|/(|A|+|B|+EPS)>RTOL. This sets the internal
= variables `$diff_*_eps' where `*' is one of the following
= words: int8, short, int, long float, or double. For
= EPS=0, the algorithm is similar (but not identical) to
= `normal' relative differencing. But for EPS=1, it behaves
= in such a way as to shift between this alternate relative
= differencing for large numbers and absolute differencing
= for numbers near zero. Default is -1 (e.g. turned off).
We could also use numpy's built in method
https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.allclose.html
which evaluates |A-B| > ( EPS + RTOL * |B| )
We could also use numpy's built in method https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.allclose.html which evaluates
|A-B| > ( EPS + RTOL * |B| )
The reference says it evaluates |A-B| <= ( ATOL + RTOL * |B| )
@rrsettgast Yeah I modified the notation to match the silodiff docs.
Example:
In a mechanics problem, Stress in one part of the problem (cell A) is 1e6. Stress in another part of the problem (cell B) is 1e-3. Your linear solver is going to solve to a certain residual norm, which is essentially a measure of nodal force....which is an integration of the stress...so I am going to use force and stress interchangeably. Lets' say the residual tolerance is 1.0e-8, and assume that this roughly correlates with driving the residual down to ~1e-2. What would that mean when we do a point wise diff on on the portion of the problem where the stress is 1e-3 (cell B)? Lets say the value changed from a baseline of 1e-3 to a new value of 1.1e-3 (which is under the residual tolerance) and we are using ATOL=1.0e-6 and RTOL=1.0e-6
With the numpy.allclose() we would have a diff if:
| 1.1e-3 - 1.0e-3 | > ( 1.0e-6 + 1.0e-6 * 1.0e-3 )
1.0e-4 > 1.001e-6
So that would fail the check.
If you do what I proposed....
| 1.1e-3 - 1.0e-3 | > 1.0e-6 * ( 1.0 + 1e6 )
1.0e-4 > 1.0'ish
pass.
And the RTOL check:
| 1.1e-3 - 1.0e-3 | > 1.0e-6 * 1.0e-3
1.0e-4 > 1.0e-9
would fail...but you passed the ATOL check so the check passes.
Alternatively we could just make a single check with a single tolerance...
|x1 - x2| > (1 + max( max(|x2|), |x2| ) ) * RTOL
| 1.1e-3 - 1.0e-3 | > ( 1.0 + 1e6 ) * 1.0e-6
1.0e-4 > 1.0'ish
...but we would have to experiment a little bit.
I'm just a little skeptical of scaling our tolerances to the largest value in the baseline data set. If all we cared about was the linear solver stuff then we should just take the L2 norm of the diff and compare that with a single tolerance value. But I don't like that idea much either.
I'm just a little skeptical of scaling our tolerances to the largest value in the baseline data set. If all we cared about was the linear solver stuff then we should just take the L2 norm of the diff and compare that with a single tolerance value. But I don't like that idea much either.
Well, there is no doubt that this proposal should be met with some healthy skepticism. I view it as an unfortunate reality of the nature of the data that we are trying to diff. These are not really independent data points that we are comparing. The stress example is intended to illustrate this point, and the proposal does specially seek to loosen ATOL for values where the precision is insufficient to ever achieve a tight ATOL. These cases are listed (but not necessarily limited to):
- The baseline value has insufficient precision when compared to the rest of the governing tolerances for the numerical method (e.g. value is much smaller than other field values, AND a these values are related thorough a non/linear residual tolerance...which is likely an L2 norm)
I don't know that the L2 norm of the diff would fix this?
Taking the L2 norm of the diff In the previous example:
baseline:
xb = ( 1.0e-3, 1.0e6 )
new value:
x = ( 1.1e-3, 1.00000001e6 )
xdiff = ( 1.0e-4, 1.0e-2 )
| xdiff |_2 = 0.0100005 > TOL
but I am not sure this is what you intended. Perhaps we want to propose a set of difficult cases for which the diff scheme must work.
One complementary option is we could allow a test to provide different tolerances for different fields.
defaultAtol = 1e-9
defaultRtol = 1e-9
specializations = [(".*/d.*_d.*", (1e-4, 1e-7)]
What about using the cosine of the angle between the two vectors to compare them?
dotProduct( x, y ) / |x||y| < 1 - tolerence
The example above passes with a vanishing tolerance.
@cssherman @CusiniM Given the updated ATS, should we revisit this or close it ?
@rrsettgast - The updates to ATS won't impact how the diffs are calculated, since we define them ourselves in these scripts:
- https://github.com/GEOS-DEV/integratedTests/blob/develop/scripts/geos_ats_package/geos_ats/helpers/restart_check.py
- https://github.com/GEOS-DEV/integratedTests/blob/develop/scripts/geos_ats_package/geos_ats/helpers/curve_check.py
We get around these issues with the curve checks by allowing either a single tolerance or a list of tolerance values that is the same size as the curve definitions (probably not scalable for restart checks, TBH). We then use this check:
dx = target - baseline
diff = np.sqrt(np.sum(dx * dx)) / dx.size
if (diff > tolerance):
errors.append(f'{modifier}_{parameter_name}_{set_name} diff exceeds tolerance: ||t-b||/N={diff}, {modifier}_tolerance={tolerance}')