The compiler isn't properly reducing expressions to eliminate `vdivps` in favor of `vmulps`

Open pbrubaker opened this issue 3 years ago • 1 comments

https://godbolt.org/z/Go9v73v13

The compiler isn't correctly evaluating this expression and folding it down to:

col.a = a * 0.00392157f;

However, it works correctly if 1.f/255.f is defined as a constant:

const varying float one_over_255 = 1.0f/255.0f;
col.a = a * one_over_255;

pbrubaker avatar Jul 17 '22 12:07 pbrubaker

Another example: see the division by toPeak_len.

https://ispc.godbolt.org/z/391Ghx81T

pbrubaker avatar Jul 18 '22 05:07 pbrubaker

These seem to be doing the right thing when you add --opt=fast-math

https://godbolt.org/z/o6MY1ajnE https://ispc.godbolt.org/z/o4ePjYqxG

JeffRous avatar Mar 07 '23 23:03 JeffRous

It seems that in this case ISPC has the same behavior as Clang: https://godbolt.org/z/o74oo1eE1

aneshlya avatar Jun 29 '23 20:06 aneshlya

I would say this is not a compiler issue. This is how FP precision rules are defined.

a / const1 => a * const2 (where const2 = 1 / const1) is the reciprocal transformation, and it is not allowed under the default FP precision model (i.e. the "precise" model), because const2 is generally rounded and the product can differ from the quotient in the last bit. It is allowed under the "fast" FP model enabled by --opt=fast-math. The same rules apply to C/C++ (the compiler switches just have different names).
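To make the precision argument concrete, here is a minimal sketch in C rather than ISPC, on the assumption that the scalar float rules are the same. The function names `div_by_255`, `mul_by_recip`, and `count_mismatches` are made up for this illustration. It also assumes IEEE 754 single precision with round-to-nearest, `float` expressions evaluated in `float` precision, and a build without fast-math style flags.

```c
/* Division exactly as written in the source: a / 255.f */
float div_by_255(float a) {
    return a / 255.0f;
}

/* Reciprocal form: the constant 1/255 is rounded to float first,
   then the multiplication rounds a second time. */
float mul_by_recip(float a) {
    const float one_over_255 = 1.0f / 255.0f;
    return a * one_over_255;
}

/* Count the 8-bit channel values (0..255) for which the two forms
   produce different float results. */
int count_mismatches(void) {
    int mismatches = 0;
    for (int i = 0; i <= 255; ++i) {
        float a = (float)i;
        if (div_by_255(a) != mul_by_recip(a)) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

Under those assumptions, a = 3 is one input where the two results differ in the last bit, while a = 1 and a = 255 agree. A single differing input is enough to make the rewrite illegal for a compiler that must preserve results exactly.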

a * 1.f / 255.f is treated as (a * 1.f) / 255.f, so it's simplified to a / 255.f.

If the expression is written as a * (1.f / 255.f), then it is simplified to a * const1, where const1 = 1.f / 255.f. This is allowed under the "precise" FP model, as it is just constant folding.
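The effect of the parentheses can be sketched the same way. This is again C standing in for ISPC, with hypothetical function names, and it assumes IEEE 754 single precision, `float` expressions evaluated in `float` precision, and no fast-math flags.

```c
/* Parsed left to right as (a * 1.0f) / 255.0f, so a division
   remains after the multiplication by 1 is simplified away. */
float unparenthesized(float a) {
    return a * 1.0f / 255.0f;
}

/* The parenthesized subexpression is a constant, so it can be
   folded at compile time, leaving a single multiplication. */
float parenthesized(float a) {
    return a * (1.0f / 255.0f);
}

/* Plain division, for comparison with the two spellings above. */
float plain_division(float a) {
    return a / 255.0f;
}
```

With these definitions, `unparenthesized` always agrees with `plain_division`, whereas `parenthesized` can differ from both in the last bit (for example at a = 3). The choice of spelling is therefore a choice of result, which is why the compiler leaves it to the source.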

I suggest closing the issue.

dbabokin avatar Jul 20 '23 00:07 dbabokin