[PoC, do not merge] src: gpu: intel: add user-supplied precomputed zps to gemm
This is a proof-of-concept implementation of user-supplied ZPs in GEMM: ref-only for now, and with no foolproofing.
The RFC for this change is here.
Hi @hidefromkgb. I think we should spend a little more time on the external API definition, as exposing the precomputed column reduction of SRC as a separate zero-point seems counter-intuitive:
- The need for a precomputed SRC column reduction is specific to matmul/convolution compensation. For other primitives, which have no dot-product-like semantics, compensation would take a very different form.
- The shape of the precomputed SRC column reduction is determined by the WEIGHTS zero-point shape as well as by the primitive being run (e.g. matmul or convolution). So it would make sense to infer the compensation shape from the weights zero-point, rather than take new masks for this parameter.
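To make the two points above concrete, here is a minimal standalone sketch of the compensation identity for a 2D matmul with grouped weights zero-points. It does not use any oneDNN API; the function names, the row-major layouts, and the group size `G` along K are assumptions made for illustration only. It relies on the identity `sum_k A[m][k] * (B[k][n] - zp[k/G][n]) = sum_k A[m][k] * B[k][n] - sum_g R[m][g] * zp[g][n]`, where `R[m][g]` is the sum of SRC row `m` over the K indices in group `g`.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Illustrative helpers only, not oneDNN API. All matrices are row-major.
using i32 = std::int32_t;

// Direct form: C[m][n] = sum_k A[m][k] * (B[k][n] - zpB[k / G][n]).
std::vector<i32> matmul_direct(const std::vector<i32> &A,
        const std::vector<i32> &B, const std::vector<i32> &zpB, int M, int K,
        int N, int G) {
    assert(K % G == 0);
    std::vector<i32> C(M * N, 0);
    for (int m = 0; m < M; ++m)
        for (int n = 0; n < N; ++n)
            for (int k = 0; k < K; ++k)
                C[m * N + n] += A[m * K + k]
                        * (B[k * N + n] - zpB[(k / G) * N + n]);
    return C;
}

// Precomputed SRC reduction over K within each group: R is M x (K / G).
std::vector<i32> src_group_sums(
        const std::vector<i32> &A, int M, int K, int G) {
    assert(K % G == 0);
    const int NG = K / G;
    std::vector<i32> R(M * NG, 0);
    for (int m = 0; m < M; ++m)
        for (int k = 0; k < K; ++k)
            R[m * NG + k / G] += A[m * K + k];
    return R;
}

// Compensated form: C = A * B - R * zpB, with R supplied by the caller.
std::vector<i32> matmul_compensated(const std::vector<i32> &A,
        const std::vector<i32> &B, const std::vector<i32> &zpB,
        const std::vector<i32> &R, int M, int K, int N, int G) {
    assert(K % G == 0);
    const int NG = K / G;
    std::vector<i32> C(M * N, 0);
    for (int m = 0; m < M; ++m)
        for (int n = 0; n < N; ++n) {
            i32 acc = 0;
            for (int k = 0; k < K; ++k)
                acc += A[m * K + k] * B[k * N + n];
            for (int g = 0; g < NG; ++g)
                acc -= R[m * NG + g] * zpB[g * N + n];
            C[m * N + n] = acc;
        }
    return C;
}
```

In this sketch `R` has shape M x (K / G) regardless of whether the zero-point varies along N, which is why its shape can be derived from the weights zero-point and the matmul dimensions alone.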
Given the above, I wonder whether the column reduction of SRC really belongs with the zero-point and scales attributes. It might be simpler to expose it as a flag or an extra parameter for matmul. From the user's perspective:
- they would create matmul with `skip_src_column_reduction` (tentative name :)).
- they query md for `DNNL_ARG_SRC_COL_REDUCTION` from primitive_desc
- they create memory object with that md and their pointer
With the above, the content and shape of the argument are clear, since they are tied to the matmul semantics with a weights zero-point.
Another approach would be a new attribute (not tied to zero-point). But again, attributes are often applicable to multiple primitives, not just matmul, and since this one is inherently tied to the dot product, it might be better to make it part of the pd constructor.
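As a rough illustration of how the shape could follow from the matmul semantics, the sketch below derives the reduction dims for a plain 2D matmul from the weights zero-point dims. The helper name and the `{k_groups, n}` convention for the zero-point dims are hypothetical and not part of oneDNN; convolution would need its own rule.

```cpp
#include <array>
#include <cassert>

// Hypothetical helper, not part of oneDNN: dims of the SRC reduction for a
// 2D matmul SRC(M x K) * WEIGHTS(K x N), given the weights zero-point dims.
// wei_zp_dims = {k_groups, n}, where k_groups is 1 (no grouping along K) or
// K / G for group size G, and n is 1 or N.
std::array<int, 2> infer_src_reduction_dims(
        int M, int K, const std::array<int, 2> &wei_zp_dims) {
    const int k_groups = wei_zp_dims[0];
    assert(k_groups >= 1 && K % k_groups == 0);
    // One partial sum per SRC row and per K-group; the N extent of the
    // zero-point does not affect the reduction shape.
    return {M, k_groups};
}
```

For example, under these assumptions a per-N zero-point with no grouping along K yields an M x 1 reduction, while four groups along K yield M x 4.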
make test
set test_scope=NIGHTLY
disable test_device_cpu
disable benchdnn_all
enable benchdnn_matmul
enable benchdnn_ip
enable arch_gpu_xe-hpg-atsm
enable arch_gpu_xe-hpg-dg2
enable arch_gpu_xe-lp
enable arch_gpu_xe-lpg
enable arch_gpu_xe-lpg+
enable arch_gpu_xe2-hpg-bmg
enable arch_gpu_xe2-lpg
enable arch_gpu_xe3-lpg

make test perf-gpu
set primitive=matmul ip
Superseded by #3750. Closing.