oneDNN

[GPU] Batched GEMM scale support

Open kealan-barbieri opened this issue 8 months ago • 3 comments

Description

Enable scaling for batched GEMMs that are not reshaped down to 2D.

Fixes # MFDNN-13705
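
For readers unfamiliar with the term, the following is a minimal reference sketch of what scaling a batched GEMM means when the batch dimension is kept rather than folded into a 2D problem. It is plain Python for illustration only, not oneDNN code; the function name `batched_scaled_gemm` and the one-scale-per-batch-entry layout are assumptions chosen to keep the example small.

```python
def batched_scaled_gemm(a, b, b_scales):
    """Reference C[i] = A[i] @ (b_scales[i] * B[i]).

    a:        [batch][M][K] nested lists
    b:        [batch][K][N] nested lists
    b_scales: [batch] floats (hypothetical layout: one scale per batch entry)
    """
    out = []
    for a_i, b_i, s_i in zip(a, b, b_scales):
        m, k, n = len(a_i), len(b_i), len(b_i[0])
        # The scale is looked up per batch entry, so the batch index
        # must survive all the way into the inner product.
        out.append([[sum(a_i[r][p] * (s_i * b_i[p][c]) for p in range(k))
                     for c in range(n)] for r in range(m)])
    return out


a = [[[1.0, 2.0]], [[3.0, 4.0]]]        # batch=2, M=1, K=2
b = [[[1.0], [1.0]], [[1.0], [1.0]]]    # batch=2, K=2, N=1
c = batched_scaled_gemm(a, b, [0.5, 2.0])
print(c)
```

Because each batch entry carries its own scale, the scale tensor needs a batch index of its own, which is the situation this PR targets.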

Checklist

General

  • [ ] Do all unit and benchdnn tests (make test and make test_benchdnn_*) pass locally for each commit?
  • [x] Have you formatted the code using clang-format?

kealan-barbieri avatar Jun 02 '25 23:06 kealan-barbieri

@kealan-barbieri -- I didn't see an implementation for {a,b}scPtrDims == 3 (which I guess is the case we want here), is there a missing commit?

petercad avatar Jun 03 '25 05:06 petercad

@petercad The required cases for MFDNN-13705 so far effectively don't use 3D ptr dims; they can all be handled with conversion to post-ops and the existing binary batch offset handling. I will add a follow-up commit to handle true 3D scales for cases with int4 weights and a nontrivial group.

kealan-barbieri avatar Jun 03 '25 23:06 kealan-barbieri
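
The distinction drawn in the comment above can be sketched numerically. This is one reading of the comment, not the oneDNN implementation: all function names and tensor layouts below are hypothetical. The sketch shows that weight scales which are constant along K can be factored out of the product and applied as a broadcast multiply on the output (the post-op style), while scales that vary along K in groups cannot be factored out that way.

```python
def matmul(a, b):
    """Plain 2D matmul on nested lists: a is [M][K], b is [K][N]."""
    return [[sum(a[r][p] * b[p][c] for p in range(len(b)))
             for c in range(len(b[0]))] for r in range(len(a))]


def scale_inside(a, b, s):
    """Scale the weights before multiplying. s is [batch][N]."""
    return [matmul(a_i, [[v * s_i[c] for c, v in enumerate(row)] for row in b_i])
            for a_i, b_i, s_i in zip(a, b, s)]


def scale_as_post_op(a, b, s):
    """Unscaled matmul, then multiply the output by s broadcast over M."""
    return [[[v * s_i[c] for c, v in enumerate(row)] for row in matmul(a_i, b_i)]
            for a_i, b_i, s_i in zip(a, b, s)]


def scale_grouped(a, b, s, group):
    """Scales vary along K in groups. s is [batch][K // group][N]."""
    return [matmul(a_i, [[v * s_i[p // group][c] for c, v in enumerate(row)]
                         for p, row in enumerate(b_i)])
            for a_i, b_i, s_i in zip(a, b, s)]


a = [[[1.0, 2.0]], [[3.0, 4.0]]]                            # batch=2, M=1, K=2
b = [[[1.0, 2.0], [3.0, 4.0]], [[1.0, 0.0], [0.0, 1.0]]]    # batch=2, K=2, N=2
s = [[0.5, 2.0], [4.0, 0.25]]                               # [batch][N]

inside = scale_inside(a, b, s)
post = scale_as_post_op(a, b, s)

# Scales that differ per K group (group size 1 here) in batch entry 0.
s_grouped = [[[0.5, 0.5], [2.0, 2.0]], [[1.0, 1.0], [1.0, 1.0]]]
grouped = scale_grouped(a, b, s_grouped, group=1)
print(inside == post, grouped[0])
```

In the grouped case the scale sits inside the sum over K, so any equivalent output-side multiplier would depend on the values of A. That is why this case needs the scale to be applied inside the kernel with its full 3D indexing.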

@kealan-barbieri OK, so if I understand correctly the generator-side changes are not necessary for this commit, but are preparing for your next commit with true 3D scale support.

petercad avatar Jun 03 '25 23:06 petercad

make test
set test_scope=NIGHTLY
disable test_device_cpu
disable benchdnn_all
enable benchdnn_matmul

kealan-barbieri avatar Jun 18 '25 23:06 kealan-barbieri

make test perf-gpu
set primitive=matmul

kealan-barbieri avatar Jun 18 '25 23:06 kealan-barbieri