Paul Fultz II comments

Results: 395 comments of Paul Fultz II

Add optional fp16 rmsnorm conversion pass to fix fp16 accuracy

> since other matchers can depend on that specific structure, so they will probably need to be extended as well. This same rewriting is needed for layernorm. Right now we...

Add optional fp16 rmsnorm conversion pass to fix fp16 accuracy

> Also, this math rewrite is only a partial solution, since larger numbers can still cause problems. I would suggest having a dedicated, optional rmsnorm fp16-to-fp32 conversion pass for those cases....
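
To illustrate why large values are a problem for rmsnorm in fp16, here is a minimal sketch in plain Python. It is not MIGraphX code: `to_fp16`, `rmsnorm_fp16`, and `rmsnorm_wide` are hypothetical helper names, fp16 rounding is emulated with the standard `struct` module, and Python's double-precision float stands in for fp32. The point is only that squaring an fp16 value above roughly 255 already exceeds the fp16 maximum of 65504, while doing the reduction in a wider type and converting back at the end does not overflow.

```python
import math
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE half-precision value.

    struct raises OverflowError for finite values too large for fp16;
    we map that case to +/-inf, which is what fp16 hardware would produce.
    """
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        return math.copysign(math.inf, x)


def rmsnorm_fp16(xs, eps=1e-5):
    """rmsnorm where every intermediate result is rounded to fp16."""
    xs = [to_fp16(x) for x in xs]
    acc = 0.0
    for x in xs:
        acc = to_fp16(acc + to_fp16(x * x))  # x * x can overflow to inf
    mean = to_fp16(acc / len(xs))
    rms = to_fp16(math.sqrt(to_fp16(mean + eps)))
    return [to_fp16(x / rms) for x in xs]


def rmsnorm_wide(xs, eps=1e-5):
    """rmsnorm with fp16 inputs and outputs but a wider intermediate type.

    Assumption: Python's double stands in for fp32 here.
    """
    xs = [to_fp16(x) for x in xs]
    mean = sum(x * x for x in xs) / len(xs)
    rms = math.sqrt(mean + eps)
    return [to_fp16(x / rms) for x in xs]


if __name__ == "__main__":
    data = [300.0, -300.0, 300.0, -300.0]
    print("all fp16:", rmsnorm_fp16(data))
    print("wide    :", rmsnorm_wide(data))
```

With these inputs the all-fp16 version collapses to zeros because the sum of squares becomes infinite, whereas the wide version returns values of magnitude one.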

Remove layernorm fusion

@kahmed10 Any feedback on this?

unregistered operation 'migraphx.max' found in dialect ('migraphx')

We probably need to add this operator to MLIR, but it is still worth checking for supported pointwise operators before fusing attention.

unregistered operation 'migraphx.max' found in dialect ('migraphx')

It's a change that needs to happen in MLIR, but it should be simple to add.

Fuse reductions with MLIR with multi-outputs

The goal of this is to fuse layernorm with two convs/gemms. In #3010, it will fuse what's after the reductions with the following convolution, but we still need to...

Fuse reductions with MLIR with multi-outputs

> 1. Fuse reshapes first. convolution + reshapes + pointwise --> Always enable this; use https://github.com/ROCm/AMDMIGraphX/pull/3010 for the fuse module functionality. This addresses issue #2822.

Improve horizontal fusion with multi-used splits

Related #3844, #3920

Support non-topologically sorted graphs

We follow the node order in the ONNX file so that our order stays consistent with it: https://github.com/ROCm/AMDMIGraphX/pull/479 For an invalid ONNX file, we can write a Python script...
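
The comment above is cut off, so what the script would actually do is not shown here. As one possible reading, the sketch below reorders a list of nodes so that every node comes after the nodes producing its inputs (Kahn's algorithm). The node representation, plain dicts with `name`, `inputs`, and `outputs` keys, and the function name `topo_sort` are assumptions made for illustration; this does not use the `onnx` package or any MIGraphX API.

```python
from collections import deque


def topo_sort(nodes):
    """Return nodes reordered so producers come before their consumers.

    nodes: list of dicts with 'name', 'inputs', 'outputs' (lists of
    tensor names). This layout is hypothetical, chosen for the example.
    """
    # Map each tensor name to the index of the node that produces it.
    producer = {}
    for i, node in enumerate(nodes):
        for out in node["outputs"]:
            producer[out] = i

    indegree = [0] * len(nodes)
    consumers = [[] for _ in nodes]
    for i, node in enumerate(nodes):
        for inp in node["inputs"]:
            p = producer.get(inp)  # None: graph input or initializer
            if p is not None:
                consumers[p].append(i)
                indegree[i] += 1

    # Start from nodes with no unresolved inputs, in their original order.
    ready = deque(i for i, d in enumerate(indegree) if d == 0)
    order = []
    while ready:
        i = ready.popleft()
        order.append(nodes[i])
        for c in consumers[i]:
            indegree[c] -= 1
            if indegree[c] == 0:
                ready.append(c)

    if len(order) != len(nodes):
        raise ValueError("graph contains a cycle; cannot sort")
    return order


if __name__ == "__main__":
    unsorted_nodes = [
        {"name": "relu", "inputs": ["y"], "outputs": ["z"]},
        {"name": "conv", "inputs": ["x", "w"], "outputs": ["y"]},
    ]
    print([n["name"] for n in topo_sort(unsorted_nodes)])
```

Nodes that are already in a valid order keep their relative position, since ready nodes are processed first-in, first-out.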

BF16 fused_reduce compile fail

What is the compiler error? I don't see any error.