Support DistributedArrays.jl
@mdavezac
I was thinking that it might be easier to get a use-case working for DistributedArrays.jl than it is with GPUArrays.jl. We could do something simple like a finite-difference solve, but where the data is spread between different machines.
I'm not convinced it would be easier. DArray has a special constructor, since each process should only create its own local storage. Ideally, we would want to think about typical data access patterns and distribute the data accordingly.
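For reference, here is a minimal sketch of what that constructor looks like. It assumes a recent DistributedArrays.jl (where `localindices` is exported) and two local workers standing in for separate machines. The matrix size `n` and the tridiagonal second-difference stencil are only illustrative choices, not something taken from this package.

```julia
using Distributed
addprocs(2)                       # two local workers, purely for illustration
@everywhere using DistributedArrays

n = 8                             # hypothetical problem size

# The init function receives a tuple of index ranges `I` and runs on each
# worker, which allocates and fills only the block it owns.
d = DArray((n, n)) do I
    [i == j ? 2.0 : abs(i - j) == 1 ? -1.0 : 0.0 for i in I[1], j in I[2]]
end

# Ask each worker which index ranges it holds.
for p in vec(procs(d))
    println(p, " => ", remotecall_fetch(localindices, p, d))
end
```

Note that the default layout is chosen by the package, so nothing here takes the banded structure or the expected access pattern into account.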
Also, I'm not sure whether it would be very efficient without tailoring many of the linalg operations to banded distributed arrays. As far as I know, Julia does not aggregate IPC messages, so each access to out-of-node data will generate a separate message. On large matrices, communication would soon swamp compute.
On the GPU side, I imagine there are concerns about which operations can be done efficiently on the GPU? Is that why you are thinking DArrays?
You are probably right. That said, if the columns of the DArray are on different nodes, then communication will automatically be minimised.
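A column-wise layout can be requested explicitly through the `dist` argument. The following is only a sketch, assuming a recent DistributedArrays.jl and two local workers; the 6×6 dense matrix is a placeholder for a banded one.

```julia
using Distributed
addprocs(2)                       # local workers, purely for illustration
@everywhere using DistributedArrays

A = rand(6, 6)                    # placeholder data

# dist = [1, nworkers()]: a single chunk along the rows and one chunk per
# worker along the columns, so every worker holds whole columns.
dcols = distribute(A; procs = workers(), dist = [1, nworkers()])

# Collect the (rows, columns) ranges owned by each worker.
layout = [remotecall_fetch(localindices, p, dcols) for p in vec(procs(dcols))]
println(layout)
```

With this layout, any operation that works on one column at a time only touches data local to a single worker.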
GPUArrays should be fine, though we’ll have to do PRs to CLBLAS.jl and CLArrays.jl to get gbmv! working.
Another good option is an SMatrix backend from StaticArrays.jl.