[Bug] Packing broken when not using pinned memory or ATS
When we're not using pinned memory, our packing buffers are standard host allocations. They are written to directly from the GPU, which only works on Lassen because of the Address Translation Service (ATS); ATS is not present on most GPU machines, and even where it is, this path is quite slow.
My suggested fix is that we stop using a std::vector and instead use an Array1D. When not using pinned memory this can be the standard array1d< buffer_unit_type >, which will move the memory between spaces as appropriate. For pinned memory we should create a new LvArray buffer type that allocates pinned memory (or, more generally, uses a specific Umpire allocator). This buffer type would not do any memory movement.
This will require changing at least the device packing functions to take an array1d instead of a raw pointer, but that should be pretty easy.
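To make the idea concrete, here is a minimal standalone sketch of an allocator-parameterized buffer type along the lines suggested above. All names (`PackBuffer`, `PinnedAllocator`, `packValues`) are hypothetical, not the actual LvArray or GEOSX API, and the "pinned" allocator here is just `malloc` standing in for what would really be `cudaMallocHost` or an Umpire `PINNED` allocator:

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// Stand-in for a pinned-memory allocator. A real implementation would
// call cudaMallocHost / cudaFreeHost, or go through a specific Umpire
// allocator; plain malloc/free is used here only so the sketch compiles
// anywhere.
struct PinnedAllocator
{
  static void * allocate( std::size_t bytes ) { return std::malloc( bytes ); }
  static void deallocate( void * ptr ) { std::free( ptr ); }
};

// Hypothetical buffer type whose allocation strategy is a template
// parameter, analogous to an LvArray buffer backed by a chosen Umpire
// allocator. Because the allocation is pinned (in the real version),
// this buffer performs no host/device memory movement of its own.
template< typename T, typename ALLOC >
class PackBuffer
{
public:
  explicit PackBuffer( std::size_t size ):
    m_size( size ),
    m_data( static_cast< T * >( ALLOC::allocate( size * sizeof( T ) ) ) )
  {}

  ~PackBuffer() { ALLOC::deallocate( m_data ); }

  // Non-copyable: the buffer owns its allocation.
  PackBuffer( PackBuffer const & ) = delete;
  PackBuffer & operator=( PackBuffer const & ) = delete;

  T * data() { return m_data; }
  std::size_t size() const { return m_size; }

private:
  std::size_t m_size;
  T * m_data;
};

// Illustrative packing routine that receives the buffer type rather
// than a raw pointer, mirroring the proposed signature change to the
// device packing functions.
template< typename BUFFER >
void packValues( BUFFER & buf, std::vector< int > const & src )
{
  for( std::size_t i = 0; i < src.size() && i < buf.size(); ++i )
  {
    buf.data()[ i ] = src[ i ];
  }
}
```

The point of the template parameter is that the non-pinned build can plug in a buffer that migrates memory between spaces, while the pinned build plugs in one that never moves memory, without the packing routines caring which they got.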
While investigating the testPacking unit test, the error was first revealed in PackDataDevice for an array1d of R1Tensor.
The way R1Tensor is packed relies on some very precise/tricky C++.
I don't know if this is related; just mentioning it.
(I tried simplifying it just to see, with no success on Pangea3 with a GPU.)
Hi @corbett5, do you know if somebody's working on this? I still cannot pass the unit tests on Pangea3 (despite your previous patch) and I do not know how to move forward. 🤷
No one is working on it right now. This should only have an effect when using the command-line option to disable pinned memory, so I don't think it would fix your problems.
I'll modify #1098 to include removing the pointer-based packing and instead use the array1d type.
@wrtobin @corbett5 any update?