TensorComprehensions

[wip] Grid synchronizations and mapping to blocks.

Open math-fehr opened this issue 7 years ago • 1 comments

This is a WIP that allows grid synchronizations to be generated. It also changes the mapping algorithm so that bands can be mapped to blocks even if there is no outermost coincident band.

This wip is based on #316, so it includes warp synchronizations.

Instead of finding the outermost band at the base of the tree, the mapping finds all the outermost coincident bands in the tree (bands that have a coincident dimension and no ancestor band with a coincident dimension), and inserts zero-dimensional bands where necessary. It then maps threads below the selected bands and blocks above them. It also inserts grid synchronizations where they are needed to ensure the correctness of the compiled kernel.
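To illustrate the band selection described above, here is a minimal sketch on a toy tree. The `Node` struct, `makeNode`, `buildExampleTree` and `collectOutermostCoincident` are hypothetical names invented for this example; they are not the isl or TC schedule tree API, and the sketch leaves out the insertion of zero-dimensional bands and the actual block/thread mapping.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a schedule tree node (hypothetical, not the TC/isl type).
struct Node {
  std::string name;
  bool isBand = false;
  std::vector<bool> coincident;  // one flag per band dimension
  std::vector<std::unique_ptr<Node>> children;
};

// Helper to build a node; non-band nodes carry no coincidence flags.
std::unique_ptr<Node> makeNode(std::string name, bool isBand,
                               std::vector<bool> coincident = {}) {
  auto n = std::make_unique<Node>();
  n->name = std::move(name);
  n->isBand = isBand;
  n->coincident = std::move(coincident);
  return n;
}

// A band qualifies if at least one of its dimensions is coincident.
bool hasCoincidentDim(const Node& n) {
  if (!n.isBand) return false;
  for (bool c : n.coincident) {
    if (c) return true;
  }
  return false;
}

// Depth-first walk that stops descending at the first qualifying band on
// each path, so no collected band has a qualifying ancestor.
void collectOutermostCoincident(const Node& n, std::vector<const Node*>& out) {
  if (hasCoincidentDim(n)) {
    out.push_back(&n);
    return;
  }
  for (const auto& child : n.children) {
    collectOutermostCoincident(*child, out);
  }
}

// Example tree: seq -> { A(non-coincident) -> B(coincident) -> C(coincident),
//                        D(coincident) }
std::unique_ptr<Node> buildExampleTree() {
  auto root = makeNode("seq", false);
  auto a = makeNode("A", true, {false});
  auto b = makeNode("B", true, {true, false});
  b->children.push_back(makeNode("C", true, {true}));
  a->children.push_back(std::move(b));
  root->children.push_back(std::move(a));
  root->children.push_back(makeNode("D", true, {true}));
  return root;
}
```

On the example tree, the walk selects `B` and `D`: `A` has no coincident dimension, and `C` is skipped because its ancestor `B` already qualifies.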

To use grid synchronization, the kernel must be launched with a cooperative launch. This requires all blocks and threads to be co-resident on the device, so the number of threads and blocks must be low enough. This is checked before the mapping is done, so the new mapping algorithm is only used when possible. Grid synchronization also requires CUDA 9 or later, but this isn't currently checked. Grid synchronization is only used when the --grid_sync option is on (it is on by default).
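The co-residency condition can be sketched as a simple arithmetic check. This is only an illustration, not the check implemented in this PR: `DeviceLimits` and `fitsCooperativeLaunch` are hypothetical names. In a real CUDA program the two limits would typically come from `cudaDeviceProp::multiProcessorCount` and `cudaOccupancyMaxActiveBlocksPerMultiprocessor`, and the launch itself would go through `cudaLaunchCooperativeKernel`.

```cpp
#include <cassert>

// Hypothetical summary of the device limits relevant to co-residency.
struct DeviceLimits {
  int numSMs;          // number of streaming multiprocessors
  int maxBlocksPerSM;  // resident blocks per SM for this kernel and block size
};

// All blocks of a cooperative launch must be resident at the same time,
// so the grid may not exceed the device-wide resident block capacity.
bool fitsCooperativeLaunch(long numBlocks, const DeviceLimits& limits) {
  long capacity = static_cast<long>(limits.numSMs) * limits.maxBlocksPerSM;
  return numBlocks > 0 && numBlocks <= capacity;
}
```

For example, with 80 SMs and 2 resident blocks per SM, a grid of 160 blocks fits and a grid of 161 blocks does not, in which case a mapper would have to fall back to a mapping without grid synchronization.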

With grid synchronizations, the mapper seems to find a better mapping than the original algorithm for the 4fcrelu kernel. However, I haven't tried it on many kernels, so I don't know whether other kernels also get a better mapping thanks to grid synchronizations.

There are still improvements to make, such as reducing the number of grid synchronizations inserted and improving shared memory promotion. Promotion is currently called multiple times on distinct schedule trees, which maximizes promotion in the first schedule trees instead of promoting across all the trees at the same time.

math-fehr avatar Apr 27 '18 14:04 math-fehr

@nicolasvasilache as a reminder, github does not show which commit you reviewed so "this commit should add" is not helpful.

ftynse avatar Jun 06 '18 08:06 ftynse