RecursiveArrayTools.jl icon indicating copy to clipboard operation
RecursiveArrayTools.jl copied to clipboard

access by UnitRange not implemented/performance issue with .x

Open daviehh opened this issue 6 years ago • 1 comments

Access linearly works, but accessing via range does not:

u0=ArrayPartition(rand(1,2),rand(1,3))
u0[1:4]

DimensionMismatch("output array is the wrong size; expected (Base.OneTo(4),), got (5,)")

Although it can be avoided by something like u0[:][1:4]

Somewhat related, there seems to be quite a noticeable performance hit when accessing the solution's .x tuple. e.g.

using DifferentialEquations,RecursiveArrayTools


function eqn_flat(du,u,p,t)
    du .= 0.3 .* u
end
    
function eqn(du,u,p,t)
    for i in 1:RecursiveArrayTools.npartitions(u)
        du.x[i] .= 0.3 .* u.x[i]
    end
end

u0 = ArrayPartition([rand(2,2) for i in 1:20]...)
u0f = rand(2,2,20)

tspan=(0.0,30.0)

prob = ODEProblem(eqn,u0,tspan)
prob_f = ODEProblem(eqn_flat,u0f,tspan)
sol = solve(prob,Tsit5());
sol_f = solve(prob_f,Tsit5());

if one then wants to do further processing on the, e.g. first 10 2-by-2 matrices solution,

function process(sol)
    f=[sol(tx)[:,:,j] for tx in 0:0.01:30, j in 1:10]
end

function process_x(sol)
    f=[sol(tx).x[j] for tx in 0:0.01:30, j in 1:10]
end

function process_x_all(sol)
    f=[reshape(sol(tx)[:][1:40],2,2,10) for tx in 0:0.01:30]
end

using BenchmarkTools

@btime process(sol_f);

67.081 ms (1050096 allocations: 39.59 MiB) (julia default matrices method)

@btime process_x(sol);

254.091 ms (1649858 allocations: 84.89 MiB) (access via .x tuple)

if we use : and change 2d comprehension to 1d, using process_x_all

@btime process_x_all(sol);

26.270 ms (182996 allocations: 12.20 MiB)

@btime process_x_all(sol_f);

6.594 ms (111015 allocations: 7.16 MiB)

of course, with the flat solution we do not need [:]

function process_x_all_flat(sol)
    f=[reshape(sol(tx)[1:40],2,2,10) for tx in 0:0.01:30]
end

@btime process_x_all_flat(sol_f);

6.435 ms (108014 allocations: 5.06 MiB)

either way, looks like there's a 3~4x slowdown when accessing ArrayPartition.

daviehh avatar Feb 14 '19 22:02 daviehh

Using the ArrayPartition is only optimized with broadcast and when you do operations along the tuple. Tsit5 doesn't do all of that yet in its interpolation because of the Base broadcast performance regression.

ChrisRackauckas avatar Feb 14 '19 23:02 ChrisRackauckas