Implements a Loop Fusion Transformation

Open kaushikcfd opened this issue 4 years ago • 3 comments

Loopy-flavored loop-fusion transformation corresponding to https://doi.org/10.1007/3-540-57659-2_18.
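
For readers unfamiliar with the transformation, here is a minimal sketch of what loop fusion does, written in plain Python rather than with Loopy's API. The function names `unfused` and `fused` are hypothetical and exist only for illustration. Fusion is legal in this example because `c[i]` depends only on `b[i]` from the same iteration.

```python
def unfused(a):
    """Two separate loops over the same index range."""
    n = len(a)
    b = [0] * n
    c = [0] * n
    for i in range(n):
        b[i] = 2 * a[i]
    for i in range(n):
        c[i] = b[i] + 1
    return c


def fused(a):
    """The same computation with both statements in one loop.

    Each iteration reads only the b[i] it has just written, so merging
    the loops does not change the result.
    """
    n = len(a)
    b = [0] * n
    c = [0] * n
    for i in range(n):
        b[i] = 2 * a[i]
        c[i] = b[i] + 1
    return c
```

The fused version traverses the index range once and uses `b[i]` immediately after producing it, which is the data-locality benefit that fusion typically aims for.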

kaushikcfd avatar Oct 04 '21 14:10 kaushikcfd

This could be rebased now that the prerequisite generate_loop_schedule_v2 is in.

inducer avatar Oct 18 '24 15:10 inducer

FYI @kaushikcfd, while I was browsing through this code the other day trying to understand a warning that was being emitted (which turned into inducer/meshmode#453), I spotted a few opportunities to avoid recomputation and speed things up a fair amount in get_kennedy_unweighted_fusion_candidates. Specifically, the calls I noticed being repeated were _get_partial_loop_nest_tree_for_fusion, _get_ldg_nodes_from_loopy_insn, and (I think, need to revisit and confirm) get_insn_access_map. If I can find some time this week, I'll finish my changes and push them for you to take a look at.

majosm avatar Mar 11 '25 15:03 majosm

@majosm: Thanks for pointing out the potential bottlenecks. I memoized those routines.

kaushikcfd avatar Mar 12 '25 12:03 kaushikcfd

Pushed some cosmetic fixes. This was complex to review, but I think I've got a decent understanding of it now. LGTM, in it goes!

inducer avatar Jul 10 '25 16:07 inducer