Giuseppe Rossini
### Problem Description

Given this program:

```
# test.py
p = migraphx.program()
m = p.get_main_module()
p_a = m.add_parameter("inputA",migraphx.shape(type="half_type", lens=[2,1024,1280]))
p_b = m.add_parameter("inputB",migraphx.shape(type="half_type", lens=[2,1280,10240]))
p_c = m.add_parameter("inputC",migraphx.shape(type="half_type", lens=[2,1024,10240]))
p_dot = m.add_instruction(migraphx.op("dot"),...
```
Hi all, I am trying to use the PDL patterns to double tile a generic matmul-like operation:

```
def add_matmul_schedule(module):
    dimM, dimN, dimK = [0, 0], [1, 1], [0, 1]...
```
Hi all, While I finally have some satisfying numbers with my specific 2048^3 experiment (I will push the latest transforms as soon as possible), I thought it was time to...
Hi @nicolasvasilache, all, Before I venture into writing a pass, I was wondering whether you have already thought about how transposition is handled in the code. ### How (I...
This stemmed from the discussion here: https://github.com/google/iree-llvm-sandbox/pull/83#discussion_r763689396 Basically, the situation I often find myself in is that I manually apply some transformation to the MLIR textual code produced by the...
The iteration in the epilogue of the pipelined loop was starting from the upper bound of the **non-pipelined** loop. If the original loop was:

```
for k = 0:K:
  %stage0[k]...
```
This is my first PR in Triton, and it is trying to fix the limitation on `dot` to support sizes bigger than `(M,N,K)==(16,16,16)`. I modified `semantic.py` to relax the `tt.dot`...
This PR is building on top of https://github.com/triton-lang/triton/pull/4638 to finally add support for buffer operations. For now we will focus on buffer load/store, but in the future we might add...
In this PR I am trying to refactor the specializations that we apply to the signature of a given function in Triton. Basically, given a kernel there are some argument...