OneHotArrays.jl

Performance with views

Open mcabbott opened this issue 2 years ago • 2 comments

I was hit by the following performance bug when using this package together with MLUtils:

julia> let 
       x, _ = Flux.splitobs(Flux.onehotbatch(rand(1:99, 100), 1:100); at=1.0, shuffle=false)
       @show summary(x)
       emb = Flux.Embedding(100 => 100)
       a = @btime $emb($x)  # very slow fallback matmul
       println("OneHotMatrix")
       b = @btime $emb(parent($x))  # indexing
       x32 = x .+ 0f0
       @show summary(x32)
       c = @btime $emb.weight * $x32  # BLAS
       end;
summary(x) = "100×100 view(OneHotMatrix(::Vector{UInt32}), :, 1:100) with eltype Bool"
  min 590.041 μs, mean 659.717 μs (7 allocations, 62.50 KiB)
OneHotMatrix
  min 2.953 μs, mean 7.642 μs (2 allocations, 39.11 KiB)
summary(x32) = "100×100 Matrix{Float32}"
  min 6.583 μs, mean 10.608 μs (2 allocations, 39.11 KiB)

One way around this would be to include such views in OneHotLike. Another would be to simply turn views into copies, which is what happens if you reverse the order and call splitobs before onehotbatch:

julia> let
       tmp, _ = Flux.splitobs(rand(1:99, 100); at=1.0, shuffle=false)
       x = Flux.onehotbatch(tmp, 1:100)
       @show summary(x)
       emb = Flux.Embedding(100 => 100)
       @btime $emb($x)
       end;
summary(x) = "100×100 OneHotMatrix(::Vector{UInt32}) with eltype Bool"
  min 2.970 μs, mean 7.479 μs (2 allocations, 39.11 KiB)

More immediately, MLUtils.splitobs could also do what it says it does, and call getobs:

help?> Flux.splitobs
  splitobs(data; at, shuffle=false) -> Tuple

  Split the data into multiple subsets proportional to the value(s) of at.

  If shuffle=true, randomly permute the observations before splitting.

  Supports any datatype implementing the numobs and getobs interfaces.
[...]

julia> Flux.getobs(ones(1,5), 1:2)  # what it says it does
1×2 Matrix{Float64}:
 1.0  1.0

julia> Flux.obsview(ones(1,5), 1:2)  # what it actually uses
1×2 view(::Matrix{Float64}, :, 1:2) with eltype Float64:
 1.0  1.0

mcabbott, Aug 26 '23 20:08

I believe the view-taking behaviour of splitobs predates MLUtils itself? Might be worth having a discussion about its semantics there.

ToucheSir, Aug 28 '23 14:08

Another workaround is to call getobs:

julia> traindata, valdata = Flux.splitobs(Flux.onehotbatch(rand(1:3, 10), 1:3); at=0.8, shuffle=false);

julia> traindata
3×8 view(OneHotMatrix(::Vector{UInt32}), :, 1:8) with eltype Bool:
 ⋅  ⋅  1  ⋅  1  ⋅  ⋅  ⋅
 1  1  ⋅  1  ⋅  ⋅  1  ⋅
 ⋅  ⋅  ⋅  ⋅  ⋅  1  ⋅  1

julia> getobs(traindata)
3×8 OneHotMatrix(::Vector{UInt32}) with eltype Bool:
 ⋅  ⋅  1  ⋅  1  ⋅  ⋅  ⋅
 1  1  ⋅  1  ⋅  ⋅  1  ⋅
 ⋅  ⋅  ⋅  ⋅  ⋅  1  ⋅  1
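
Applied to the embedding example from the original report, the same workaround might look like the sketch below. It assumes a Flux version in which `Flux.splitobs` and `Flux.getobs` are reachable as they are in the snippets in this thread; the names `x_view` and `x_copy` are only illustrative, and the type that `getobs` returns may depend on the MLUtils version.

```julia
using Flux

# splitobs returns views, so x_view wraps the OneHotMatrix in a SubArray
x_view, _ = Flux.splitobs(Flux.onehotbatch(rand(1:99, 100), 1:100); at=0.8, shuffle=false)

# Materialise the view before the forward pass (in the transcript above
# this gives back a OneHotMatrix; treat that as version-dependent)
x_copy = Flux.getobs(x_view)

emb = Flux.Embedding(100 => 100)

# Both inputs should give the same result; only the code path taken differs
@assert emb(x_view) ≈ emb(x_copy)
```

Whether this removes the slowdown in a given setup is best confirmed with `@btime` from BenchmarkTools, as in the first post.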

Also note that when a dataset returned by splitobs is passed to a DataLoader, getobs is called on each generated mini-batch, so the problem won't appear in typical deep learning workflows.

CarloLucibello, Jan 25 '25 15:01