[PoC, do not merge] src: gpu: intel: add user-supplied precomputed zps to gemm

Open hidefromkgb opened this issue 9 months ago • 2 comments

This is a proof-of-concept implementation of user ZPs in GEMM — ref-only as of now, and with no foolproofing.

The RFC for this change is here.

hidefromkgb avatar May 06 '25 22:05 hidefromkgb

Hi @hidefromkgb. I think we should spend a little more time on the external API definition, as exposing the precomputed column reduction of SRC as a separate zero-point seems counter-intuitive:

  • The need for a precomputed SRC column reduction is specific to matmul/convolution compensation. For other primitives, which have no dot-product-like semantics, compensation would take a very different form.
  • The shape of the precomputed SRC column reduction is determined by the WEIGHTS zero-point shape as well as by the primitive being run (e.g. matmul or convolution). So it would seem to make sense to infer the compensation shape from the weights zero-point rather than take new masks for this parameter.
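To make the two points above concrete, here is a minimal pure-Python sketch of the compensation identity for a matmul with weights zero-points. It is not oneDNN code. It assumes integer data, no SRC zero-point, and weights zero-points that are either per-N or grouped along K. The function names and the `group` parameter are hypothetical, and reading "column reduction" as a sum of SRC over K is an assumption.

```python
# Toy check of the weights zero-point compensation identity (not oneDNN code).

def matmul_with_wei_zp(src, wei, zp, group):
    """Reference: dst[m][n] = sum_k src[m][k] * (wei[k][n] - zp[k // group][n])."""
    M, K, N = len(src), len(wei), len(wei[0])
    return [[sum(src[m][k] * (wei[k][n] - zp[k // group][n]) for k in range(K))
             for n in range(N)] for m in range(M)]

def src_reduction(src, group):
    """Reduce SRC over K, producing one value per (row, K-group)."""
    K = len(src[0])
    return [[sum(row[g * group:(g + 1) * group]) for g in range(K // group)]
            for row in src]

def matmul_with_precomputed(src, wei, zp, red):
    """dst[m][n] = sum_k src[m][k] * wei[k][n] - sum_g red[m][g] * zp[g][n]."""
    M, K, N, G = len(src), len(wei), len(wei[0]), len(zp)
    return [[sum(src[m][k] * wei[k][n] for k in range(K))
             - sum(red[m][g] * zp[g][n] for g in range(G))
             for n in range(N)] for m in range(M)]

src = [[1, 2, 3, 4], [5, 6, 7, 8]]                   # M=2, K=4
wei = [[1, 0, 2], [3, 1, 0], [0, 2, 1], [1, 1, 1]]   # K=4, N=3
zp_per_n = [[1, 2, 3]]                               # one zero-point group spanning K
zp_grouped = [[1, 2, 3], [0, 1, 2]]                  # two zero-point groups of 2 along K

for zp in (zp_per_n, zp_grouped):
    group = len(src[0]) // len(zp)
    red = src_reduction(src, group)
    # Both formulations agree, so the reduction can be supplied up front.
    assert matmul_with_precomputed(src, wei, zp, red) == matmul_with_wei_zp(src, wei, zp, group)
    print(f"zp groups: {len(zp)} -> reduction shape: {len(red)} x {len(red[0])}")
```

In this toy setup the reduction has shape M x (number of zero-point groups), so it follows from the weights zero-point layout and needs no mask of its own.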

Given the above, I wonder whether column reduction of SRC really belongs with the zero-point and scales attributes. It might be simpler to expose it as a flag or an extra parameter for matmul. From the user's perspective:

  • they would create a matmul with skip_src_column_reduction (tentative name :)).
  • they would query the md for DNNL_ARG_SRC_COL_REDUCTION from the primitive_desc
  • they would create a memory object with that md and their pointer

With the above, the content and shape of the argument are clear, since the argument is tied to matmul semantics with a weights zero-point.
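The three steps above could look roughly like this from the user's side. This is a pure-Python mock and not the oneDNN API: `ToyMatmulPD`, `query_md`, and `wei_zp_groups` are hypothetical stand-ins. `skip_src_column_reduction` and `DNNL_ARG_SRC_COL_REDUCTION` are the tentative names from the comment, and the value assigned to the latter is made up.

```python
# Pure-Python mock of the proposed user flow; none of this is real oneDNN API.
from dataclasses import dataclass

# Tentative argument name from the discussion; the value is a placeholder.
DNNL_ARG_SRC_COL_REDUCTION = "src_col_reduction"

@dataclass
class ToyMatmulPD:
    """Hypothetical stand-in for a matmul primitive descriptor."""
    m: int
    k: int
    n: int
    wei_zp_groups: int = 1                    # assumed: zero-point groups along K
    skip_src_column_reduction: bool = False   # tentative flag name

    def query_md(self, arg):
        """Return the shape expected for `arg`, or None if the argument is unused."""
        if arg == DNNL_ARG_SRC_COL_REDUCTION and self.skip_src_column_reduction:
            # Shape is inferred from the weights zero-point layout, not a user mask.
            return (self.m, self.wei_zp_groups)
        return None

# 1. Create the matmul with the flag set.
pd = ToyMatmulPD(m=2, k=4, n=3, wei_zp_groups=2, skip_src_column_reduction=True)

# 2. Query the descriptor (here just a shape) for the reduction argument.
shape = pd.query_md(DNNL_ARG_SRC_COL_REDUCTION)

# 3. Wrap the user's precomputed buffer with that descriptor.
user_buffer = [3, 7, 11, 15]
assert len(user_buffer) == shape[0] * shape[1]
args = {DNNL_ARG_SRC_COL_REDUCTION: (shape, user_buffer)}
```

The point of the mock is that the user never specifies a shape or mask for the reduction: the descriptor reports it.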

Another approach would be a new attribute (not tied to zero-point). But again, attributes are often applicable to multiple primitives, not just matmul, and given how closely this one is tied to the dot product, it might be better to make it part of the pd constructor.

mgouicem avatar Jun 26 '25 11:06 mgouicem

make test set test_scope=NIGHTLY disable test_device_cpu disable benchdnn_all enable benchdnn_matmul enable benchdnn_ip enable arch_gpu_xe-hpg-atsm enable arch_gpu_xe-hpg-dg2 enable arch_gpu_xe-lp enable arch_gpu_xe-lpg enable arch_gpu_xe-lpg+ enable arch_gpu_xe2-hpg-bmg enable arch_gpu_xe2-lpg enable arch_gpu_xe3-lpg

hidefromkgb avatar Jun 27 '25 02:06 hidefromkgb

make test perf-gpu set primitive=matmul ip

hidefromkgb avatar Jun 27 '25 02:06 hidefromkgb

Superseded by #3750. Closing.

hidefromkgb avatar Aug 20 '25 23:08 hidefromkgb