Alexander Efimov
@ptillet I've removed AMD stuff from common cmakes, could you take a look?
@ptillet @ThomasRaoux I've rebased this PR and removed some unused stuff, could you take a look again?
Essentially, this patch enables the `triton::FPToFP` operation to cast fp16 to fp32 and back. @joviliast, is that correct? @ptillet Do we want the `FPToFP` operation to be able to convert any float...
> This patch disables casting. Did you mean "pass"? It disables only "intermediate" casting; the rest is in place:
- https://github.com/openai/triton/pull/3091/files#diff-c1b7645c56652fe811ae807f37bbea185af4221b1122fa095fbd43b80266182eR1686
- https://github.com/openai/triton/pull/3091/files#diff-c1b7645c56652fe811ae807f37bbea185af4221b1122fa095fbd43b80266182eR1700
To clarify what this PR is doing: at the moment we have an optimization in the `DecomposeUnsupportedLayouts` pass, which looks for `convert_layout` operations that require more shared memory than we...
@zhanglx13 about `tryMinimizeLDS`: the condition filters out cases which will definitely overflow LDS, and there is no early exit. We can actually remove this condition altogether, because we are...
> the early return condition needs to be removed

Now I see, I've missed **this** early return, thank you! At first I thought you were talking about early exit from...
@antiagainst @zhanglx13 This PR is ready for review, PTAL :slightly_smiling_face:
@zhanglx13 @antiagainst PTAL
> what's the status on this pull request? Do we still need it?

I don't think we should focus on this at the moment, because it is not blocking anything...