LoopVectorization.jl

Cannot recognize index `i` in `@turbo` over `Vector{DateTime}`

Open hayesall opened this issue 4 years ago • 3 comments

Overview

I get this error:

ERROR: UndefVarError: i not defined
Stacktrace:
 [1] scale_stamps_turbo(data::Vector{DateTime})
   @ Main ./REPL[7]:5
 [2] top-level scope
   @ REPL[15]:1

When running:

using Dates
using LoopVectorization

function scale_stamps_turbo(data::Vector{Dates.DateTime})
    out = similar(data, Float64)
    ϕ = (data[lastindex(data)] - data[1]).value
    @turbo for i ∈ eachindex(data)
        out[i] = (data[i] - data[1]).value / ϕ
    end
    return out
end

When I do not use the @turbo macro, it works correctly:

using Dates

function scale_stamps(data::Vector{Dates.DateTime})
    out = similar(data, Float64)
    ϕ = (data[lastindex(data)] - data[1]).value
    for i ∈ eachindex(data)
        out[i] = (data[i] - data[1]).value / ϕ
    end
    return out
end

julia> hcat(mydata, scale_stamps(mydata))
5×2 Matrix{Any}:
 1990-01-01T00:00:01  0.0
 1990-01-01T00:00:03  0.142857
 1990-01-01T00:00:06  0.357143
 1990-01-01T00:00:10  0.642857
 1990-01-01T00:00:15  1.0

Debugging 1: Correct behavior with Vector{Int64}

It looks like this is related to using Vector{Dates.DateTime}. The following two functions perform the same operation over vectors of Int64, and both work correctly:

using LoopVectorization

function scale(data::Vector{Int64})
    out = similar(data, Float64)
    ϕ = data[lastindex(data)] - data[1]
    for i ∈ eachindex(data)
        out[i] = (data[i] - data[1]) / ϕ
    end
    return out
end

function scale_turbo(data::Vector{Int64})
    out = similar(data, Float64)
    ϕ = data[lastindex(data)] - data[1]
    @turbo for i ∈ eachindex(data)
        out[i] = (data[i] - data[1]) / ϕ
    end
    return out
end

Sample Output

julia> hcat(somedata, scale(somedata), scale_turbo(somedata))
10×3 Matrix{Float64}:
  1.0  0.0   0.0
  3.0  0.08  0.08
  6.0  0.2   0.2
  8.0  0.28  0.28
 12.0  0.44  0.44
 14.0  0.52  0.52
 17.0  0.64  0.64
 21.0  0.8   0.8
 22.0  0.84  0.84
 26.0  1.0   1.0

Benchmark

julia> @benchmark scale(benchmark_data)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  254.978 μs …   1.789 ms  ┊ GC (min … max): 0.00% … 81.20%
 Time  (median):     255.789 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   308.832 μs ± 141.360 μs  ┊ GC (mean ± σ):  4.30% ±  8.33%

  █▁▂▁         ▆▄ ▁                                             ▁
  █████▆▇▅▅▅▄█▇█████▅▅▄▄▅▄▄▃▃▁▁▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ █
  255 μs        Histogram: log(frequency) by time        941 μs <

 Memory estimate: 625.08 KiB, allocs estimate: 2.

julia> @benchmark scale_turbo(benchmark_data)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  59.220 μs … 888.608 μs  ┊ GC (min … max):  0.00% … 73.05%
 Time  (median):     68.954 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   79.751 μs ±  76.075 μs  ┊ GC (mean ± σ):  11.16% ± 10.52%

  █▆▃▁                                                         ▁
  ████▆▅▄▁▁▁▁▁▄█▅▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ █
  59.2 μs       Histogram: log(frequency) by time       705 μs <

 Memory estimate: 625.08 KiB, allocs estimate: 2.

Helper Methods for Debugging

My inputs are vectors of strictly increasing values, so here are two functions for generating data:

Generate Int64 Vectors / DateTime Vectors

using Dates
using Random

function generate_data(N::Int64)
    data = Vector{Int64}(undef, N)
    v = 0
    for i in 1:N
        v += rand(1:5)  # a strictly positive step keeps the sequence increasing
        data[i] = v
    end
    return data
end

function generate_timestamps(N::Int64)
    data = Vector{DateTime}(undef, N)
    v = DateTime(1990, 1, 1, 0, 0, 0)
    for i in 1:N
        v += Second(i)  # the gap grows by one second per element
        data[i] = v
    end
    return data
end

hayesall, Sep 23 '21 16:09

Ah, the problem is with my lazy method for supporting getproperty. None of

julia> Union{Bool,Base.HWReal}
Union{Bool, Float32, Float64, Int16, Int32, Int64, Int8, UInt16, UInt32, UInt64, UInt8}

support getproperty, so LoopVectorization assumes that any getproperty call acts on an object that can be hoisted out of the loop. An object that cannot be hoisted must depend on the loop somehow, i.e. it must be loaded from an index. Since LV only supports loading and operating on Union{Bool,Base.HWReal}, none of which have getproperty, moving the expression out of the loop should generally be fine.
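To make the type situation concrete, here is a small sketch using only the Dates standard library. It shows that subtracting two DateTime values yields a Millisecond period whose .value field is the Int64 that the loop body reads through getproperty. The variable names a, b and d are illustrative and do not appear in the original report.

```julia
using Dates

a = DateTime(1990, 1, 1, 0, 0, 1)
b = DateTime(1990, 1, 1, 0, 0, 3)

# Subtracting two DateTimes gives a period object, not a hardware number.
d = b - a

@show typeof(d)                                    # Millisecond
@show d.value                                      # 2000 (milliseconds, an Int64)

# Neither DateTime nor the period is one of the element types listed above,
# while the extracted field is.
@show DateTime <: Union{Bool,Base.HWReal}          # false
@show typeof(d) <: Union{Bool,Base.HWReal}         # false
@show typeof(d.value) <: Union{Bool,Base.HWReal}   # true
```

So the only hardware-compatible value in the expression is the one obtained after the getproperty call, which is exactly the case the hoisting assumption does not anticipate.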

So that's what is happening here: (data[i] - data[1]).value gets moved out of the loop. After hoisting, the code is effectively

    getprop =  (data[i] - data[1]).value
    for i ∈ eachindex(data)
        out[i] = getprop / ϕ
    end

LoopVectorization also checks to make sure all arrays are of a valid element type; e.g. DateTime is not:

julia> typeof(ts)
Vector{DateTime} (alias for Array{DateTime, 1})

julia> LoopVectorization.check_args(ts)
false

However, the above loop only has out in it, and out isa Vector{Float64}, so it passes the check. It doesn't notice that data was there.
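As a rough illustration of that gap, the check can be run by hand on every array the original loop body touches. This is only a sketch: check_args is an internal helper rather than documented public API, so its behavior may change between versions, and the sample data is made up for the example.

```julia
using Dates
using LoopVectorization

# Illustrative inputs: five timestamps and the Float64 output buffer.
data = [DateTime(1990, 1, 1, 0, 0, s) for s in (1, 3, 6, 10, 15)]
out = similar(data, Float64)

# The output buffer alone passes the element-type check ...
@show LoopVectorization.check_args(out)

# ... while the DateTime input does not.
@show LoopVectorization.check_args(data)

# Checking both arrays together is what would have caught the problem.
@show all(LoopVectorization.check_args, (out, data))
```

Because the hoisted loop only mentions out, only the first of these checks is effectively performed.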

Hence it does end up running this code instead of a fallback @inbounds @fastmath loop, and you get the error once it evaluates

    getprop =  (data[i] - data[1]).value
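For comparison, a hand-written version of the fallback mentioned above might look roughly like the following. This is a sketch of the idea, not the code LoopVectorization actually generates, and the function name scale_stamps_fallback is hypothetical. It works directly on DateTime because nothing is hoisted.

```julia
using Dates

# Hypothetical manual fallback: a plain loop with @inbounds and @fastmath,
# operating on the DateTime values themselves.
function scale_stamps_fallback(data::Vector{DateTime})
    out = similar(data, Float64)
    ϕ = (data[lastindex(data)] - data[1]).value
    @inbounds @fastmath for i ∈ eachindex(data)
        out[i] = (data[i] - data[1]).value / ϕ
    end
    return out
end

ts = [DateTime(1990, 1, 1, 0, 0, s) for s in (1, 3, 6, 10, 15)]
scale_stamps_fallback(ts)
```

With these five timestamps the result should match the scale_stamps output posted in the report, at the cost of missing the speedup that @turbo provides.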

The simplest solution is probably to use reinterpret to cast your array:

julia> tsi = reinterpret(Int, ts);

julia> LoopVectorization.check_args(tsi)
true

julia> typeof(tsi)
Base.ReinterpretArray{Int64, 1, DateTime, Vector{DateTime}, false}

That way you can use your data::Vector{DateTime} with scale_turbo, after you loosen the signature to scale_turbo(data::AbstractVector{Int64}).
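Putting the suggestion together, a minimal sketch could look like this. scale_turbo is the Int64 function from the report with its signature loosened as described. The wrapper name scale_stamps_reinterpret is hypothetical and only there for illustration.

```julia
using Dates
using LoopVectorization

# Same kernel as scale_turbo in the report, but accepting any AbstractVector{Int64},
# which includes the ReinterpretArray produced below.
function scale_turbo(data::AbstractVector{Int64})
    out = similar(data, Float64)
    ϕ = data[lastindex(data)] - data[1]
    @turbo for i ∈ eachindex(data)
        out[i] = (data[i] - data[1]) / ϕ
    end
    return out
end

# Hypothetical convenience wrapper: view the DateTime storage as Int64
# (millisecond counts) without copying, then reuse the integer kernel.
scale_stamps_reinterpret(data::Vector{DateTime}) = scale_turbo(reinterpret(Int64, data))

ts = [DateTime(1990, 1, 1, 0, 0, s) for s in (1, 3, 6, 10, 15)]
scale_stamps_reinterpret(ts)
```

Since the scaling only uses differences and a ratio, working on the raw millisecond counts should give the same values as the DateTime version, with no getproperty call left inside the @turbo loop.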

chriselrod, Sep 23 '21 18:09

That worked, really helpful explanation as well. Thank you!

Would you like for something like this to be contributed to the Examples documentation?

hayesall, Sep 23 '21 19:09

Sure, that'd be appreciated!

chriselrod, Sep 23 '21 19:09