libsamplerate Undefined behavior due to shifts on signed integers

As found in https://github.com/erikd/libsamplerate/issues/65#issuecomment-516058420 using bit shifts on signed integers is undefined behavior. Reason for that is the difference between a "regular shift" (shr) and "arithmetic shift" (sar) where the former just shifts bits while the latter preserves the sign bit in all cases. Also there is a rounding difference for positive and negative numbers.

Examples:

-1 = b11111111
-1 shr 1 = b01111111 = 127
-1 sar 1 = b11111111 = -1

-10 sar 2 = -3
10 sar 2 = 2

This may not be a problem in practice as usually int >> value is translated to sar but I found 1 compiler where it is translated to shr: https://godbolt.org/z/sj_uO6

Not sure if/how to solve this as i >> value is meant as floor(i/2^value) while (int)(i/2^value) is trunc(i/2^value) and hence cannot be "optimized" to a shift.

Jul 30 '19 09:07 Flamefire

Maybe something like this: https://github.com/erikd/libsndfile/blob/master/src/common.h#L1090-L1112

Aug 05 '19 03:08 erikd

The left shift will run into UB if the range of an int is reached. Interestingly this UB makes it produce the expected SAL O.o

The right shift is... interesting. It relies on 2s-complement or at least that the highest bit is 1 for negative numbers. I haven't found a clear definition what happens for negation of negative numbers as it isn't defined that 2s-complement is used. In C++20 left and right shifts are fully defined though: https://en.cppreference.com/w/cpp/language/operator_arithmetic

I don't have a solution. Just suggestion: In the linked functions make the shift value unsigned. A negative shift is UB. And maybe add an assertion on x in arith_shift_left that INT_MIN>>shift <=x <= INT_MAX >> shift (or so)

Aug 05 '19 08:08 Flamefire

Why would you add an assertion? Aren't all the shifts constant values?

Casting from signed to unsigned, performing the shift and casting back should keep the C compiler happy. Doing this in an inline function (if the shifts are constant, they can be embedded in the actual function) seems like the solution.

Aug 05 '19 09:08 erikd

Why would you add an assertion? Aren't all the shifts constant values?

The assertion should be on the x not the shift. E.g. x = (INT_MAX >> 15) passed to shiftLeft(x, 16) is UB (until C++20). So the assert should catch such overflows which I assume are unintended anyway.

Aug 05 '19 10:08 Flamefire

I tired using the undefined behavior sanitizer on x86_64 and found no issues related to shifts at all. Eg

CFLAGS="-fsanitize=undefined -g" CXXFLAGS=${CFLAGS} ./configure
make clean all check

Aug 06 '19 07:08 erikd

UBSAN not erroring out is not a guarantee that there is no UB.

The code uses e.g. lrint(scaled_value) >> 16 where scaled value can be negative and that is UB as the standard states:

For negative a, the behavior of a << b is undefined [...] For negative a, the value of a >> b is implementation-defined

Above I linked 1 compiler (I admit a rather special one) that does not use SAR but SHR which will lead to wrong results.

Aug 06 '19 08:08 Flamefire

UBSAN not erroring out is not a guarantee that there is no UB.

In this case it actually is, but whatever.

The potential for undefined behavior at that that has still not been demonstrated to be the cause of the problem seen on PowerPC. The obvious way to determine whether this undefined behavior is the cause the PowerPC issue is to replace >> 16 with / 65536.

Aug 06 '19 12:08 erikd

I double checked and you are right: No shift is UB as the only problematic shift is in int_to_fp which is called with filter->coeff_half_len. As this can never be negative I'd define it as unsigned instead of int to reflect it.

However it does run into implementation defined behavior due to the right shift of a signed long in src_float_to_short_array, the uses of fp_to_int should not run into that.

The cause for the PowerPC problem is not the shift but a buggy lrintf which does not handle large values correctly. This issue is unrelated but was found during investigation of possible causes for the other bug.

Aug 06 '19 12:08 Flamefire

I 100% believe that PowerPC (or more likley the compiler) has a problem but you have yet to provide any real evidence that UB is happening here. You need to provide that evidence.

Aug 06 '19 21:08 erikd

As written there is no UB, but only implementation defined behaviour which is in src_float_to_short_array:

a negative float is passed to src_float_to_short_array (e.g. in the tests)
that float is scaled by a positive factor so it stays negative
that is converted to a signed long via lrint
this negative signed long is right-shifted by 16 -> implementation defined result

Solution: Use code like this:

void
src_float_to_short_array (const float *in, short *out, int len)
{	double scaled_value ;
	while (len)
	{	len -- ;
		scaled_value = in [len] * 32768 ;
		if (scaled_value >= 32767.f)
		    out [len] = 32767 ;
		else if (scaled_value <= -32768.f)
		    out [len] = -32768 ;
                else
		    out [len] = (short) (lrint (scaled_value)) ;
	} ;
}

Aug 07 '19 06:08 Flamefire

I will ask yet again, where is the evidence that this is UB?

Did you even try my suggestion of:

The obvious way to determine whether this undefined behavior is the cause the PowerPC issue is to replace >> 16 with / 65536.

Aug 07 '19 07:08 erikd

I will ask yet again, where is the evidence that this is UB?

I don't understand the question. With the steps described above a right shift of a negative signed integer is performed. Do you agree with that?
The C++ standard (which I quoted above) and the C standard define the result of a right shift of a negative signed integer as "implementation defined".
Hence it is shown that implementation defined behavior is invoked.

What else do you need?

Did you even try my suggestion of:

No, why? I don't even have a buggy PowerPC platform and the shift is unrelated to the bug reported in #65.

The current implementation works because a right shift of a signed int is "usually" defined (by the compiler in use!) to be an arithmetic shift, but a compiler on another platform may not do that (which is the meaning of "implementation defined"), see also https://stackoverflow.com/a/7636/1930508 or Wikipedia:

The (1999) ISO standard for the programming language C defines the right shift operator in terms of divisions by powers of 2.[8] Because of the above-stated non-equivalence, the standard explicitly excludes from that definition the right shifts of signed numbers that have negative values. It does not specify the behaviour of the right shift operator in such circumstances, but instead requires each individual C compiler to define the behaviour of shifting negative values right

Aug 07 '19 07:08 Flamefire

I am very hesitant to change code without evidence that the change actually fixes something.

Aug 07 '19 07:08 erikd

Sure: My proposed code does not rely on implementation defined behavior (fixes this issue) and additionally does not run into the bug @janstary has with his machines lrint implementation due to keeping the numbers low. This is proven by him at https://github.com/erikd/libsamplerate/issues/65#issuecomment-516330798 :

That makes the float-to-short test pass for me

Aug 07 '19 07:08 Flamefire

UBSAN not erroring out is not a guarantee that there is no UB.

In this case it actually is, but whatever.

The potential for undefined behavior at that that has still not been demonstrated to be the cause of the problem seen on PowerPC. The obvious way to determine whether this undefined behavior is the cause the PowerPC issue is to replace >> 16 with / 65536.

With that change (to the current git), the test fails with

tests/float_short_test

        float_to_short_test ............................. 

        Line 64 : out [1] == 0

Aug 09 '19 15:08 janstary

That is expected:

When integers are divided, the result of the / operator is the algebraic quotient with any fractional part discarded. [...] This is often called ‘‘truncation toward zero’’.

So: -1 >> n == -1 for all n but -1 / n == 0 for n>1

Aug 09 '19 15:08 Flamefire