Generated operations are in a different order on different platforms
Describe the bug Operations are placed in a different order on different platforms. This can result in unexpected numerical issues in some cases.
Testing details:
- Linux:
Ubuntu 22.04, Python 3.8.16, x86 - Mac:
OSX Ventura 13.3.1, Python 3.8.16, Apple M2/ARM - Symforce
v0.8.0, installed from pip
To Reproduce
- Run the following script to generate a test function on both OSX and Ubuntu:
from pathlib import Path
import symforce
symforce.set_symbolic_api('symengine')
symforce.set_epsilon_to_number(1.0e-16)
from symforce import geo
from symforce import symbolic as sm
from symforce.codegen import Codegen, CppConfig
from symforce import typing as T
def test_function(time: T.Scalar) -> geo.V4:
rate = sm.pi / 6
angle = rate * time
x_axis = geo.V3([-sm.cos(angle), -sm.sin(angle), 0.0])
z_axis = geo.V3([0, 0, 1])
y_axis = z_axis.cross(x_axis)
world_R_body = geo.Matrix33.column_stack(x_axis, y_axis, z_axis)
world_R_body_quat = geo.Rot3.from_rotation_matrix(R=world_R_body)
quat_elements = geo.V4(world_R_body_quat.to_storage())
world_R_body_subbed = world_R_body.subs(time, 3.01).evalf()
quat_elements_subbed = quat_elements.subs(time, 3.01).evalf()
print(world_R_body_subbed)
print(quat_elements_subbed)
return quat_elements
if __name__ == '__main__':
cg = Codegen.function(func=test_function, config=CppConfig())
path = Path(__file__).parent.absolute()
cg.generate_function(output_dir=path / 'output')
- Observe the output of the python script (the values of
world_R_body_subbedandquat_elements_subbed)
[0.00523596383141919, 0.999986292247427, 0.0]
[-0.999986292247427, 0.00523596383141919, 0.0]
[0.0, 0.0, 1.0]
[0.0]
[0.0]
[-0.705253158861618]
[0.708955557080773]
- Observe that the order of operations is different on the two platforms. This is the exact diff:
35,42c35,43
< const Scalar _tmp4 = -_tmp1;
< const Scalar _tmp5 = -std::max<Scalar>(1, std::max<Scalar>(_tmp4, _tmp3 + 1));
< const Scalar _tmp6 = 1 - std::max<Scalar>(0, -(((_tmp4 + _tmp5) > 0) - ((_tmp4 + _tmp5) < 0)));
< const Scalar _tmp7 = std::min<Scalar>(1, _tmp6);
< const Scalar _tmp8 = std::min<Scalar>(_tmp6, 1 - std::max<Scalar>(0, _tmp7));
< const Scalar _tmp9 = _tmp5 + 1;
< const Scalar _tmp10 = std::min<Scalar>(1 - std::max<Scalar>(0, std::max<Scalar>(_tmp7, _tmp8)),
< 1 - std::max<Scalar>(0, -(((_tmp9) > 0) - ((_tmp9) < 0))));
---
> const Scalar _tmp4 = _tmp3 + 1;
> const Scalar _tmp5 = -_tmp1;
> const Scalar _tmp6 = -std::max<Scalar>(1, std::max<Scalar>(_tmp4, _tmp5));
> const Scalar _tmp7 = 1 - std::max<Scalar>(0, -(((_tmp5 + _tmp6) > 0) - ((_tmp5 + _tmp6) < 0)));
> const Scalar _tmp8 = std::min<Scalar>(1, _tmp7);
> const Scalar _tmp9 = std::min<Scalar>(_tmp7, 1 - std::max<Scalar>(0, _tmp8));
> const Scalar _tmp10 =
> std::min<Scalar>(1 - std::max<Scalar>(0, std::max<Scalar>(_tmp8, _tmp9)),
> 1 - std::max<Scalar>(0, -(((_tmp6 + 1) > 0) - ((_tmp6 + 1) < 0))));
46,47c47,48
< 1 - std::max<Scalar>(0, std::max<Scalar>(_tmp10, std::max<Scalar>(_tmp7, _tmp8))),
< 1 - std::max<Scalar>(0, -(((_tmp3 + _tmp9) > 0) - ((_tmp3 + _tmp9) < 0))));
---
> 1 - std::max<Scalar>(0, std::max<Scalar>(_tmp10, std::max<Scalar>(_tmp8, _tmp9))),
> 1 - std::max<Scalar>(0, -(((_tmp4 + _tmp6) > 0) - ((_tmp4 + _tmp6) < 0))));
53,54c54,55
< _res(0, 0) = (Scalar(1) / Scalar(2)) * _tmp7 * (1 - _tmp7);
< _res(1, 0) = (Scalar(1) / Scalar(2)) * _tmp8 * (1 - _tmp8);
---
> _res(0, 0) = (Scalar(1) / Scalar(2)) * _tmp8 * (1 - _tmp8);
> _res(1, 0) = (Scalar(1) / Scalar(2)) * _tmp9 * (1 - _tmp9);
- Run the C++ test code and evaluate at
time=3.01:
#include <fmt/format.h>
#include "output/cpp/symforce/sym/test_function.h"
int main() {
const double time = 3.01;
auto v = sym::TestFunction(time);
fmt::print("time = {:>6}, v = {:>6}, {:>6}, {:>6}, {:>6}\n", time, v[0], v[1],
v[2], v[3]);
return 0;
}
- Observe that the numerical output is non-trivially different:
Mac version (resultant quaternion has zero norm):
time = 3.01, v = 0, 0, 0, 0
Linux version (resultant quaternion matches python code):
time = 3.01, v = 0, 0, -0.7052531588616179, 0.7089555570807733
NOTE: In this particular case, there may additionally be a bug in Rot3.from_rotation_matrix - the epsilon value is unused. This example just highlights how unpredictable behavior emerges from this issue.
Expected behavior Ideally order of operations would be identical on all operating systems.
Environment
- OS and version: Ubuntu 22.04, OSX Ventura
- Python version: 3.8
- SymForce Version 0.8
This is a weird one - we had similar issues in the past when SymEngine was built with different implementations of std::unordered_map. I'm a little surprised we haven't seen this, we should definitely fix
NOTE: In this particular case, there may additionally be a bug in Rot3.from_rotation_matrix - the epsilon value is unused. This example just highlights how unpredictable behavior emerges from this issue.
Yep, good callout - this is not a bug but is definitely surprising - our current implementation of from_rotation_matrix doesn't require an epsilon, but we still have the argument mostly to not break the API. In general we aim for our generated code to be identical everywhere, and treat it as a bug if it's not, regardless of whether evaluated numerical results are meaningfully different