Henry Ho
Results: 4 issues of Henry Ho
1. VectorWidthA and VectorWidthB for mfma kernel
2. Wider local read for tileMajorLDS
3. Solve bank conflict caused by VectorWidthA/B
4. Prefetch all localreads for BF16/FP16/INT8 packing to the front...
NoCI
Remove the dependency of data initialization on lda in hipblaslt-bench, so that the same data is benchmarked when the leading dimension differs.
gfx94x
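The hipblaslt-bench item above is about making the initialized matrix values independent of the leading dimension. A minimal sketch of that idea follows, assuming column-major storage and a seeded generator that is advanced only for logical elements. The names `init_matrix` and `logical_view` are hypothetical and are not taken from hipblaslt-bench; its actual implementation may differ.

```python
import random


def init_matrix(rows, cols, ld, seed=0):
    """Fill a column-major buffer of size ld * cols (hypothetical helper).

    Random values are drawn in logical (column, row) order, so the value
    stored for element (i, j) does not depend on ld. Padding rows between
    `rows` and `ld` are left at 0.0 and consume no random numbers.
    """
    if ld < rows:
        raise ValueError("leading dimension must be >= number of rows")
    rng = random.Random(seed)
    buf = [0.0] * (ld * cols)
    for j in range(cols):
        for i in range(rows):
            # Only the storage offset uses ld; the drawn value does not.
            buf[i + j * ld] = rng.uniform(-1.0, 1.0)
    return buf


def logical_view(buf, rows, cols, ld):
    """Return the rows x cols logical matrix, skipping the padding."""
    return [[buf[i + j * ld] for j in range(cols)] for i in range(rows)]


# Same logical data with a tight and a padded leading dimension.
tight = init_matrix(rows=3, cols=2, ld=3)
padded = init_matrix(rows=3, cols=2, ld=8)
same = logical_view(tight, 3, 2, 3) == logical_view(padded, 3, 2, 8)
```

If the generator were instead advanced once per buffer slot, including padding, the two runs would hold different logical data and their timings would not be directly comparable.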