StaticArrays.jl
Specialize `triu`/`tril` for `StaticMatrix`
This provides a performance boost for small matrices.
On master:
julia> S = SMatrix{3,3}(1:9)
3×3 SMatrix{3, 3, Int64, 9} with indices SOneTo(3)×SOneTo(3):
1 4 7
2 5 8
3 6 9
julia> @btime triu($S)
245.797 ns (1 allocation: 80 bytes)
3×3 MMatrix{3, 3, Int64, 9} with indices SOneTo(3)×SOneTo(3):
1 4 7
0 5 8
0 0 9
This PR:
julia> @btime triu($S)
17.183 ns (0 allocations: 0 bytes)
3×3 SMatrix{3, 3, Int64, 9} with indices SOneTo(3)×SOneTo(3):
1 4 7
0 5 8
0 0 9
I've hard-coded a maximum length of 32, up to which working on Tuples directly is faster, but it would probably be better to read this limit from Base. Any suggestions on how to approach this?
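For readers unfamiliar with the "working on Tuples directly" idea, here is a minimal sketch using only Base. It is not the code in this PR: `triu_tuple` is a hypothetical helper, and it assumes the matrix is stored column-major in a flat tuple with the dimensions passed as `Val`s. The result is built with `ntuple`, so no mutable intermediate array is needed.

```julia
# Hypothetical helper (not the PR's implementation): upper-triangular part of an
# M×N matrix stored column-major in a flat tuple. Entries on or above the k-th
# superdiagonal are kept; everything else becomes zero.
function triu_tuple(data::NTuple{L,T}, ::Val{M}, ::Val{N}, k::Int=0) where {L,T,M,N}
    L == M * N || throw(DimensionMismatch("tuple length $L does not match $M×$N"))
    ntuple(Val(L)) do idx
        i = (idx - 1) % M + 1   # row of the linear index
        j = (idx - 1) ÷ M + 1   # column of the linear index
        j - i >= k ? data[idx] : zero(T)
    end
end

data = ntuple(identity, 9)               # entries 1:9, i.e. the 3×3 matrix S above
upper = triu_tuple(data, Val(3), Val(3))
println(upper)                           # (1, 0, 0, 4, 5, 0, 7, 8, 9)
```

Read column by column, the printed tuple corresponds to the `triu` result shown above. A real implementation would wrap the tuple back into an `SMatrix` and would still need a length cutoff such as the 32 mentioned here.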