
Splines limitation: buffer of same size as input data

Open blegouix opened this issue 2 years ago • 3 comments

In the Splines branch, in the spline_builder_batched.hpp file, the object spline_tr is a buffer with the same size as spline (which holds the input/output). This means that at least half of the memory is taken up by the buffer.

Improving this is theoretically feasible but may require changing the class APIs (we currently have a solve_inplace method that acts on the buffer, and we may not want to alter the input, or at least we should let the user choose).

layout

blegouix avatar Oct 05 '23 10:10 blegouix

Please identify in the set of dimensions {x, y, z} the dimension of interest and the dimensions of batching.

tpadioleau avatar Oct 05 '23 19:10 tpadioleau

Ginkgo and Lapack impose their layouts. IMO, the first rational thing to do would be to bypass the transpose if the dimension of interest is the leftmost one (contiguous for Lapack, coalesced for Ginkgo).
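
To make the idea concrete, here is a minimal sketch in plain C++ (no Kokkos, Ginkgo or Lapack), not the actual ddc code. All names (`solve_contiguous_inplace`, `solve_batched`, the `interest_is_contiguous` flag) are hypothetical, and a trivial unit lower-bidiagonal system stands in for the real spline matrix. The point is only that the transpose buffer is allocated when the dimension of interest is not contiguous, and skipped otherwise.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a backend solver that requires the dimension of
// interest to be contiguous. For each batch it solves L x = b in place, where
// L is unit lower bidiagonal with -1 on the subdiagonal: x[i] = b[i] + x[i-1].
void solve_contiguous_inplace(double* data, std::size_t n, std::size_t n_batch)
{
    for (std::size_t b = 0; b < n_batch; ++b) {
        double* rhs = data + b * n;
        for (std::size_t i = 1; i < n; ++i) {
            rhs[i] += rhs[i - 1];
        }
    }
}

// Solves all batches stored in `data` (n * n_batch values).
// If interest_is_contiguous, element (i, b) is at data[b * n + i];
// otherwise it is at data[i * n_batch + b].
// Returns the number of elements allocated for the temporary buffer.
std::size_t solve_batched(
        std::vector<double>& data,
        std::size_t n,
        std::size_t n_batch,
        bool interest_is_contiguous)
{
    assert(data.size() == n * n_batch);
    if (interest_is_contiguous) {
        // Layout already matches the solver: no transpose, no buffer.
        solve_contiguous_inplace(data.data(), n, n_batch);
        return 0;
    }
    // Otherwise transpose into a buffer of the same size as the input,
    // solve there, and transpose the result back.
    std::vector<double> buffer(n * n_batch);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t b = 0; b < n_batch; ++b) {
            buffer[b * n + i] = data[i * n_batch + b];
        }
    }
    solve_contiguous_inplace(buffer.data(), n, n_batch);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t b = 0; b < n_batch; ++b) {
            data[i * n_batch + b] = buffer[b * n + i];
        }
    }
    return buffer.size();
}
```

In this sketch the returned buffer size is zero on the bypass path and equal to the input size on the transpose path, which is the memory overhead this issue is about.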

Going further would require handling the layout stride internally in the matrix classes. I think this is fairly straightforward with the Kokkos Kernels backend but less easy with Ginkgo (slicing according to the layout). In both cases, it is not clear whether it would benefit performance.
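
As an illustration of what handling the stride internally could look like, here is a minimal plain C++ sketch. It is not the Kokkos Kernels or Ginkgo API: the function `solve_strided_inplace` and its stride parameters are hypothetical, and a unit lower-bidiagonal system again stands in for the real matrix. The solver indexes the data through explicit strides, so it needs neither a transpose nor a buffer, at the cost of possibly non-contiguous accesses along the dimension of interest.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical strided solver. Element (i, b) is at
// data[i * stride_interest + b * stride_batch].
// For each batch it solves L x = b in place, where L is unit lower
// bidiagonal with -1 on the subdiagonal: x[i] = b[i] + x[i-1].
void solve_strided_inplace(
        double* data,
        std::size_t n,
        std::size_t n_batch,
        std::size_t stride_interest,
        std::size_t stride_batch)
{
    assert(data != nullptr || n * n_batch == 0);
    for (std::size_t b = 0; b < n_batch; ++b) {
        for (std::size_t i = 1; i < n; ++i) {
            // No transpose: the stride takes care of the layout.
            data[i * stride_interest + b * stride_batch]
                    += data[(i - 1) * stride_interest + b * stride_batch];
        }
    }
}
```

With `stride_interest == 1` the accesses along the dimension of interest are contiguous. With `stride_batch == 1` they are not, which is the case where the performance impact is unclear.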

I will wait for the Lapack backend to be available before working on it.

blegouix avatar Apr 02 '24 15:04 blegouix

At the moment, the performance of the two strategies with the KK backend (transpose to LayoutRight + coalesced solve vs. no transpose + not necessarily coalesced solve) has not been compared.

The memory footprint is probably not the main topic here.

blegouix avatar Jul 10 '24 10:07 blegouix