LoopVectorization.jl

`@turbo` gives wrong output with `%()` (in some cases) but correct with `mod()`

Open Titas22 opened this issue 11 months ago • 0 comments

Hi,

I've run into an issue where modulo operations inside some `@turbo` loops occasionally give wrong results.

The loop is essentially this:

```julia
@turbo for kk in eachindex(results)
    results[kk] = (results[kk] * bases[kk]) % m
end
```

It works for most inputs, but not all. Here's a script with some test cases:

```julia
using LoopVectorization
using Test

function test!(f!::Function, results::AbstractVector{UInt32})
    bases = copy(results)
    results .= UInt32(1)
    e, m = 437, 6059

    for ii in 1 : Int(ceil(log(Float64(e)) / log(2)))
        if (e >> (ii - 1)) & 1 == 1
            f!(results, bases, m)
        end

        # For some reason @turbo works fine in this loop, even though it also takes a modulo by m
        @turbo for jj in eachindex(bases)
            v = bases[jj]
            bases[jj] = (v * v) % m
        end
    end

    return results
end

function f1!(results::AbstractVector{UInt32}, bases::AbstractVector{UInt32}, m::Integer)
    @turbo for kk in eachindex(results)
        results[kk] = (results[kk] * bases[kk]) % m
    end
end

function f2!(results::AbstractVector{UInt32}, bases::AbstractVector{UInt32}, m::Integer)
    @turbo for kk in eachindex(results)
        results[kk] *= bases[kk]
    end
    @turbo for kk in eachindex(results)
        results[kk] %= m
    end
end

function f3!(results::AbstractVector{UInt32}, bases::AbstractVector{UInt32}, m::Integer)
    @turbo for kk in eachindex(results)
        results[kk] *= bases[kk]
    end

    # Somehow, adding @turbo to this loop makes it give the wrong result
    for kk in eachindex(results)
        results[kk] %= m
    end
end

function f4!(results::AbstractVector{UInt32}, bases::AbstractVector{UInt32}, m::Integer)
    @turbo for kk in eachindex(results)
        results[kk] = mod(results[kk] * bases[kk], m) # !!! using mod() instead of % gives the correct result
    end
end

@testset "tests" begin
    @test powermod(82, 437, 6059) == 5394 # Validation

    @test Int.(test!(f1!, UInt32.([82]))) == [5394] # <- Wrong
    @test Int.(test!(f2!, UInt32.([82]))) == [5394] # <- Still wrong

    @test Int.(test!(f3!, UInt32.([82]))) == [5394] # This one is correct
    @test Int.(test!(f4!, UInt32.([82]))) == [5394] # !!! Somehow this is also correct

    # A different input (below) is correct for all four
    @test powermod(80, 437, 6059) == 3667 # Validation
    @test Int.(test!(f1!, UInt32.([80]))) == [3667]
    @test Int.(test!(f2!, UInt32.([80]))) == [3667]
    @test Int.(test!(f3!, UInt32.([80]))) == [3667]
    @test Int.(test!(f4!, UInt32.([80]))) == [3667]
end
```
  • The `test!` function has two loops doing similar operations (just with different inputs); I haven't found an issue with the second loop yet
  • I split the operations into separate loops (`f2!` and `f3!`) to track the problem down, and it seems to come from the modulo operation
  • Wrong results seem to be rare (most inputs are fine), but they do happen
  • Using `mod()` instead of `%` fixes the issue

Results from my laptop are below (I see the same on two different machines with different CPUs, both running Windows):

```
tests: Test Failed at c:\Users\titas\Code\AeroMap.jl\WIP\test.jl:66
  Expression: Int.(test!(f1!, UInt32.([82]))) == [5394]
   Evaluated: [2068] == [5394]

tests: Test Failed at c:\Users\titas\Code\AeroMap.jl\WIP\test.jl:67
  Expression: Int.(test!(f2!, UInt32.([82]))) == [5394]
   Evaluated: [2068] == [5394]

Test Summary: | Pass  Fail  Total  Time
tests         |    8     2     10  0.4s
ERROR: Some tests did not pass: 8 passed, 2 failed, 0 errored, 0 broken.
```
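
Since only some inputs fail, a plain-Julia baseline may help narrow down which ones. The sketch below repeats the square-and-multiply scheme from `test!` without `@turbo` and compares it against `powermod`. The name `ref_powermod!` is hypothetical (not part of the script above), and the bit count uses `ndigits(e, base = 2)` rather than the `log`-based expression; the two agree for `e = 437`. It assumes `m^2` fits in a `UInt32`, which holds for `m = 6059`.

```julia
# Plain-Julia reference (no @turbo) for the square-and-multiply scheme in `test!`.
# `ref_powermod!` is a hypothetical helper name used only for this sketch.
# Assumes every product stays below 2^32, i.e. m^2 fits in a UInt32.
function ref_powermod!(results::AbstractVector{UInt32}, e::Integer, m::Integer)
    bases = copy(results)
    results .= UInt32(1)

    # Walk the bits of e from least to most significant.
    for ii in 1:ndigits(e, base = 2)
        if (e >> (ii - 1)) & 1 == 1
            # Multiply step: same arithmetic as f1!, but in an ordinary loop.
            for kk in eachindex(results)
                results[kk] = (results[kk] * bases[kk]) % m
            end
        end

        # Squaring step: same arithmetic as the second loop in test!.
        for jj in eachindex(bases)
            v = bases[jj]
            bases[jj] = (v * v) % m
        end
    end

    return results
end

# Compare a range of inputs against Base's powermod.
xs = 1:6058
reference = ref_powermod!(UInt32.(xs), 437, 6059)
mismatches = [x for (x, r) in zip(xs, reference) if r != powermod(x, 437, 6059)]
println("inputs where the plain loop disagrees with powermod: ", length(mismatches))
```

Running the same comparison with `test!(f1!, ...)` in place of `ref_powermod!` should list the inputs for which the `@turbo` version diverges.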

Titas22 · Feb 14 '25 00:02