sparse-structured-attention

Hi, is there a faster GPU version?

Open happygds opened this issue 7 years ago • 2 comments

Hi, I found the code too slow when I ran it. Is there a faster GPU version?

happygds, Jul 09 '18 06:07

Hi,

We did use the GPU in our experiments. The SparseMAP layer itself cannot run on the GPU because it relies on external C++ code. What I recommend is: (i) Run the first part of your model on the GPU. (ii) Copy the potentials to the CPU. (iii) Run SparseMAP on the CPU. (iv) Copy the result back to the GPU and finish the forward pass there. This worked well for us and, in the case of ESIM, caused only minimal slowdown.
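The four steps above can be sketched in PyTorch roughly as follows. This is a minimal illustration, not the code from the experiments: `cpu_only_layer` is a hypothetical stand-in for the SparseMAP layer (a plain softmax is used so the snippet runs without the library), and `encoder`, `head`, and the tensor shapes are made-up placeholders. The script falls back to CPU when no GPU is available.

```python
import torch


def cpu_only_layer(potentials: torch.Tensor) -> torch.Tensor:
    # Hypothetical stand-in for a CPU-bound structured layer such as SparseMAP.
    # A plain softmax is used here only so that the sketch is runnable.
    assert potentials.device.type == "cpu"
    return torch.softmax(potentials, dim=-1)


def forward_with_cpu_step(encoder, head, x):
    device = x.device
    potentials = encoder(x)                # (i) first part of the model on the GPU
    potentials_cpu = potentials.cpu()      # (ii) copy the potentials to the CPU
    attn_cpu = cpu_only_layer(potentials_cpu)  # (iii) run the CPU-only layer
    attn = attn_cpu.to(device)             # (iv) copy back and finish on the GPU
    return head(attn)


# Placeholder model parts and input; names and sizes are assumptions.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
encoder = torch.nn.Linear(8, 5).to(device)
head = torch.nn.Linear(5, 2).to(device)
x = torch.randn(4, 8, device=device)

out = forward_with_cpu_step(encoder, head, x)
out.sum().backward()  # gradients flow back through both device copies
```

Because `.cpu()` and `.to(device)` are differentiable in PyTorch, the parameters on the GPU side still receive gradients, provided the CPU-only layer itself implements a backward pass.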

Hope this helps!

vene, Jul 09 '18 11:07

@vene Really interesting. I am also wondering whether your sparse attention supports 4D tensor inputs, such as a mini-batch of images.

If the code cannot run on the GPU, it could become very slow for image processing...

PkuRainBow, Jul 16 '18 06:07