
MSVC AVX2 tests are failing

Open n00bmind opened this issue 5 months ago • 1 comments

Hi! I'm starting to test out the lib for our project (awesome work btw!) and just by chance I realised that the tests for the MSVC AVX2 build config seem to be failing right away. I don't see any issues with the AVX or SSE4.1 configs.

This is the full console output with the failures:

1) Experiments started
Experiments completed

2) Unit tests started
eq(vlerp1, lerpf(vfoo1.r, vbar1.r, vbaz1.r)) Assertion failed! Values are not equal: a = -2.990000, b = -2.990000, tolerance = 0.000000
eq(vlerp2, lerpf(vfoo2.r, vbar2.r, vbaz2.r), lerpf(vfoo2.g, vbar2.g, vbaz2.g)) Assertion failed! Values are not equal: a = -1.849000, b = -1.849000, tolerance = 0.000000
eq(vlerp3, lerpf(vfoo3.r, vbar3.r, vbaz3.r), lerpf(vfoo3.g, vbar3.g, vbaz3.g), lerpf(vfoo3.b, vbar3.b, vbaz3.b)) Assertion failed! Values are not equal: a = 1.462000, b = 1.462000, tolerance = 0.000000
eq(vlerp4, lerpf(vfoo4.r, vbar4.r, vbaz4.r), lerpf(vfoo4.g, vbar4.g, vbaz4.g), lerpf(vfoo4.b, vbar4.b, vbaz4.b), lerpf(vfoo4.a, vbar4.a, vbaz4.a)) Assertion failed! Values are not equal: a = 1.934000, b = 1.934000, tolerance = 0.000000
eq(vlerpf_4, lerpf(vfoo4.r, vbar4.r, 0.7f), lerpf(vfoo4.g, vbar4.g, 0.7f), lerpf(vfoo4.b, vbar4.b, 0.7f), lerpf(vfoo4.a, vbar4.a, 0.7f)) Assertion failed! Values are not equal: a = 1.729000, b = 1.729000, tolerance = 0.000000
eq(vlerp_swiz_1, lerpf(vfoo1.r, vbar1.r, vbaz1.r)) Assertion failed! Values are not equal: a = -2.990000, b = -2.990000, tolerance = 0.000000
eq(vlerp_swiz_2, lerpf(vfoo2.r, vbar2.r, vbaz2.r), lerpf(vfoo2.g, vbar2.g, vbaz2.g)) Assertion failed! Values are not equal: a = -1.849000, b = -1.849000, tolerance = 0.000000
eq(vlerp_swiz_3, lerpf(vfoo3.r, vbar3.r, vbaz3.b), lerpf(vfoo3.g, vbar3.g, vbaz3.g), lerpf(vfoo3.b, vbar3.b, vbaz3.r)) Assertion failed! Values are not equal: a = -2.676000, b = -2.676000, tolerance = 0.000000
eq(vlerp_swiz_4, lerpf(vfoo4.r, vbar4.r, vbaz4.r), lerpf(vfoo4.g, vbar4.g, vbaz4.g), lerpf(vfoo4.b, vbar4.b, vbaz4.b), lerpf(vfoo4.a, vbar4.a, vbaz4.a)) Assertion failed! Values are not equal: a = 1.934000, b = 1.934000, tolerance = 0.000000
eq(mat_mmul_2x2_2x3, 0.900000036f, 1.20000005f, 1.50000000f, 1.90000010f, 2.59999990f, 3.30000019f ) Assertion failed! Values are not equal: a = 2.600000, b = 2.600000, tolerance = 0.000000
eq(mat_mmul_2x2_2x4, 1.10000002f, 1.40000010f, 1.70000005f, 2.00000000f, 2.29999995f, 3.00000000f, 3.70000005f, 4.40000010f ) Assertion failed! Values are not equal: a = 1.400000, b = 1.400000, tolerance = 0.000000
eq(mat_mmul_3x2_2x2, 0.700000048f, 1.00000000f, 1.50000000f, 2.20000005f, 2.30000019f, 3.40000010f ) Assertion failed! Values are not equal: a = 0.700000, b = 0.700000, tolerance = 0.000000
eq(mat_mmul_3x2_2x3, 0.900000036f, 1.20000005f, 1.50000000f, 1.90000010f, 2.59999990f, 3.30000019f, 2.90000010f, 4.00000000f, 5.10000038f ) Assertion failed! Values are not equal: a = 2.600000, b = 2.600000, tolerance = 0.000000
eq(mat_mmul_3x2_2x4, 1.10000002f, 1.40000010f, 1.70000005f, 2.00000000f, 2.29999995f, 3.00000000f, 3.70000005f, 4.40000010f, 3.50000000f, 4.60000038f, 5.70000029f, 6.80000019f ) Assertion failed! Values are not equal: a = 1.400000, b = 1.400000, tolerance = 0.000000
eq(mat_mmul_4x2_2x2, 0.700000048f, 1.00000000f, 1.50000000f, 2.20000005f, 2.30000019f, 3.40000010f, 3.10000014f, 4.59999990f ) Assertion failed! Values are not equal: a = 0.700000, b = 0.700000, tolerance = 0.000000
eq(mat_mmul_4x2_2x3, 0.900000036f, 1.20000005f, 1.50000000f, 1.90000010f, 2.59999990f, 3.30000019f, 2.90000010f, 4.00000000f, 5.10000038f, 3.90000010f, 5.40000010f, 6.90000010f ) Assertion failed! Values are not equal: a = 2.600000, b = 2.600000, tolerance = 0.000000
eq(mat_mmul_2x3_3x3, 3.00000024f, 3.60000014f, 4.19999981f, 6.60000038f, 8.10000038f, 9.60000038f ) Assertion failed! Values are not equal: a = 3.000000, b = 3.000000, tolerance = 0.000000
eq(mat_mmul_3x3_3x3, 35929.47f, 4440.54f, 134063.46f, 3625.123f, 256.345f, -943.98f, 120437.234f, 6135.154482f, -3005.63867f, 0.001f) Assertion failed! Values are not equal: a = 120437.226562, b = 120437.234375, tolerance = 0.001000
eq(mat_mmul_3x3_3x4, 3.80000019f, 4.40000010f, 5.00000000f, 5.60000038f, 8.30000019f, 9.80000019f, 11.3000002f, 12.8000002f, 12.7999992f, 15.2000008f, 17.5999985f, 20.0000000f ) Assertion failed! Values are not equal: a = 15.200000, b = 15.200001, tolerance = 0.000000
eq(mat_mmul_4x3_3x2, 2.20000005f, 2.80000019f, 4.90000010f, 6.40000010f, 7.60000038f, 10.0000000f, 10.3000002f, 13.6000004f ) Assertion failed! Values are not equal: a = 7.600000, b = 7.600000, tolerance = 0.000000
eq(mat_mmul_4x3_3x3, 3.00000024f, 3.60000014f, 4.19999981f, 6.60000038f, 8.10000038f, 9.60000038f, 10.1999998f, 12.6000004f, 15.0000000f, 13.8000011f, 17.1000004f, 20.4000015f ) Assertion failed! Values are not equal: a = 3.000000, b = 3.000000, tolerance = 0.000000
eq(mat_mmul_4x3_3x4, 3.80000019f, 4.40000010f, 5.00000000f, 5.60000038f, 8.30000019f, 9.80000019f, 11.3000002f, 12.8000002f, 12.7999992f, 15.2000008f, 17.5999985f, 20.0000000f, 17.2999992f, 20.6000004f, 23.9000015f, 27.2000008f ) Assertion failed! Values are not equal: a = 15.200000, b = 15.200001, tolerance = 0.000000
Unit tests completed

(perf tests omitted)

Is this a tolerance issue? Looks like all tests run with a tolerance of 0? If I run Debug, so that it stops on the assertions, this is the first failure point:

Image

I tried evaluating the subexpressions in a watch window, but VS doesn't want to evaluate lerpf for whatever reason.

I'm on an i7 9700K (Coffee Lake), so AVX2 support should be no issue, afaict. Hope this helps.
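On the identical-looking `a` and `b` values in the log above: the output is presumably printed with six decimal places, which is too coarse to separate two floats that differ only in their last bit. A minimal sketch of the effect follows; `format_float` and `print_comparison` are hypothetical helpers written for this illustration and are not part of hlslpp or its test harness.

```cpp
#include <cmath>
#include <cstdio>
#include <string>

// Hypothetical helper: formats one float with the given printf-style format.
std::string format_float(const char* fmt, float v)
{
    char buf[64];
    std::snprintf(buf, sizeof(buf), fmt, v);
    return std::string(buf);
}

// Prints two floats with six decimals and with nine significant digits.
// Nine significant digits are enough to tell any two distinct floats apart.
void print_comparison(float a, float b)
{
    std::printf("%%f   : a = %s, b = %s\n",
                format_float("%f", a).c_str(), format_float("%f", b).c_str());
    std::printf("%%.9g : a = %s, b = %s\n",
                format_float("%.9g", a).c_str(), format_float("%.9g", b).c_str());
    std::printf("a == b: %s\n", (a == b) ? "true" : "false");
}
```

Calling `print_comparison(-2.99f, std::nextafter(-2.99f, 0.0f))` compares a value with its neighbouring representable float: both print as `-2.990000` under `%f`, yet they are not equal and `%.9g` shows different digits.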

n00bmind avatar Nov 18 '25 18:11 n00bmind

Hi @n00bmind, thanks for reporting. This is a known issue: the AVX2 config enables fmadd functions automatically, and the order of operations in a lerp is ever so slightly different, giving different results. I want to add a flag, something like "deterministic results between machines", as a feature so these are disabled (if one machine has the fmadd instruction and another doesn't, you may get different results), but I never got round to it.
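To illustrate the kind of difference involved, here is a scalar sketch that assumes lerp is computed as `x + (y - x) * a`; this is an assumption for illustration and not necessarily how hlslpp implements it. A fused multiply-add rounds once where the separate multiply and add round twice, so the two results can differ in the last bit. `lerp_unfused`, `lerp_fused` and `nearly_equal` are hypothetical names.

```cpp
#include <cmath>

// Separate multiply and add: the product is rounded, then the sum is rounded.
inline float lerp_unfused(float x, float y, float a)
{
    return x + (y - x) * a;
}

// Fused multiply-add: (y - x) * a + x is rounded only once at the end.
inline float lerp_fused(float x, float y, float a)
{
    return std::fma(y - x, a, x);
}

// Relative-tolerance comparison, falling back to an absolute tolerance
// for values with magnitude below 1.
inline bool nearly_equal(float a, float b, float rel_tol = 1e-6f)
{
    float scale = std::fmax(std::fabs(a), std::fabs(b));
    return std::fabs(a - b) <= rel_tol * std::fmax(scale, 1.0f);
}
```

An exact comparison (tolerance 0) between the two variants can fail for some inputs, while `nearly_equal` accepts a last-bit difference. Depending on compiler flags, the unfused version may itself be contracted into an FMA, in which case both functions return the same value.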

I'll leave this open to implement what I just described, with the fmadd instructions maybe as an optional feature for speed.
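One way such an opt-in could look is a compile-time switch, sketched below. `EXAMPLE_ENABLE_FMA`, `example_madd` and `example_lerp` are hypothetical names made up for this sketch; they are not existing hlslpp macros or functions, and the real feature may be designed differently.

```cpp
#include <cmath>

// Hypothetical switch (not an hlslpp macro): fused multiply-add is off by
// default so results do not depend on whether the machine has the instruction.
#ifndef EXAMPLE_ENABLE_FMA
#define EXAMPLE_ENABLE_FMA 0
#endif

// Computes a * b + c, fused only when explicitly requested.
inline float example_madd(float a, float b, float c)
{
#if EXAMPLE_ENABLE_FMA
    return std::fma(a, b, c); // single rounding, faster where FMA is available
#else
    return a * b + c;         // multiply and add rounded separately
#endif
}

// Scalar lerp built on top of the switchable multiply-add.
inline float example_lerp(float x, float y, float t)
{
    return example_madd(y - x, t, x);
}
```

Building with `-DEXAMPLE_ENABLE_FMA=1` would select the fused path. For the default path to stay unfused, the compiler must also be prevented from contracting `a * b + c` on its own, for example with `-ffp-contract=off` on GCC or Clang.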

redorav avatar Nov 18 '25 18:11 redorav