
Reduce node auto GPU expansion

Open · hodelcl opened this issue 3 years ago · 1 comment

This PR includes two things:

  1. the function get_reduction_schedule
  2. an auto expansion of the reduce node (ExpandReduceGPUAuto)

This PR adds the auto expansion, aimed at GPUs, to the set of available reduce node expansions. Its goal is to compute an optimal mapping of the reduction input to GPU threads and to expand the reduce node according to the computed schedule.
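For context, here is a minimal usage sketch of how one would opt into this expansion from user code. The registration name 'GPUAuto' is an assumption (the actual name is not stated above), and the surrounding calls are standard DaCe API, not part of this PR:

```python
import dace
import numpy as np

N = dace.symbol('N')

@dace.program
def sum_rows(A: dace.float64[N, N]):
    return np.sum(A, axis=1)

sdfg = sum_rows.to_sdfg()
sdfg.apply_gpu_transformations()

# Ask every Reduce library node to use the new expansion before expanding.
from dace.libraries.standard import Reduce
for node, _ in sdfg.all_nodes_recursive():
    if isinstance(node, Reduce):
        node.implementation = 'GPUAuto'  # assumed registration name
sdfg.expand_library_nodes()
```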

The schedule is computed by the newly added function get_reduction_schedule, which takes as input the dimensions of the input tensor and a list of the axes to reduce, and returns a Python dictionary (the schedule) that specifies, among other things, the number of blocks and the number of threads per block to use. At the highest level, the schedule depends on whether the data to be reduced is contiguous or non-contiguous.
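To illustrate that interface, here is a hypothetical sketch of such a schedule function. The signature, dictionary fields, and heuristics are assumptions for illustration only, not the PR's implementation:

```python
from typing import Dict, List, Sequence

def get_reduction_schedule(in_shape: Sequence[int], axes: List[int]) -> Dict:
    """Pick a block/thread mapping for reducing `axes` of a row-major tensor."""
    # The reduction is contiguous if the reduced axes are the innermost (last) ones.
    contiguous = sorted(axes) == list(range(len(in_shape) - len(axes), len(in_shape)))

    reduced_size = 1
    for a in axes:
        reduced_size *= in_shape[a]
    output_size = 1
    for i, s in enumerate(in_shape):
        if i not in axes:
            output_size *= s

    if contiguous:
        # One thread block cooperatively reduces each contiguous segment
        # (a real schedule would round this to a warp multiple).
        threads = min(512, max(32, reduced_size))
        return {'contiguous': True, 'num_blocks': output_size, 'threads_per_block': threads}
    else:
        # Strided case: one thread per output element, spread over blocks.
        threads = 256
        blocks = (output_size + threads - 1) // threads
        return {'contiguous': False, 'num_blocks': blocks, 'threads_per_block': threads}
```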

The expansion itself, ExpandReduceGPUAuto, calls get_reduction_schedule and then builds the SDFG that expands the reduce node according to the returned schedule.

If the auto expansion cannot handle a specific input, it simply falls back to the pure expansion ExpandReducePure.
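Putting the last two paragraphs together, a rough skeleton of how such an expansion and its fallback are typically wired up as a DaCe ExpandTransformation might look as follows. This is not the PR's code: it reuses the hypothetical get_reduction_schedule sketched above, assumes the schedule function signals unsupported inputs by returning None, and the import paths should be double-checked against the DaCe version in use:

```python
import dace
from dace import library
from dace.transformation.transformation import ExpandTransformation
from dace.libraries.standard.nodes.reduce import ExpandReducePure

@library.expansion
class ExpandReduceGPUAutoSketch(ExpandTransformation):
    """Illustrative skeleton only; the real ExpandReduceGPUAuto builds a full schedule-driven SDFG."""

    environments = []

    @staticmethod
    def expansion(node, state, sdfg):
        # Derive the input shape from the data descriptor feeding the reduce node.
        in_edge = state.in_edges(node)[0]
        in_shape = sdfg.arrays[in_edge.data.data].shape
        schedule = get_reduction_schedule(in_shape, list(node.axes or []))

        if schedule is None:
            # Unsupported input: fall back to the pure expansion (assumed convention).
            return ExpandReducePure.expansion(node, state, sdfg)

        # Here the real expansion would build and return a nested SDFG with GPU maps
        # sized by schedule['num_blocks'] and schedule['threads_per_block'].
        raise NotImplementedError('sketch only')
```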

hodelcl · Aug 18 '22 08:08

Exciting! Can't wait to try this

orausch · Aug 18 '22 11:08