oneDNN

[POC] xe: jit: gemm: nf4 weights decompression

Open petercad opened this issue 8 months ago • 0 comments

POC of nf4 weights decompression for Intel GPUs (MFDNN-13636), to allow OpenVINO to test it out.

Adds a new nf4 data type (may not be the final design; it is just for enabling) and optimized weights decompression kernels for basic LLM-like sizes on XeHPG/Xe2.
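For readers unfamiliar with nf4, here is a minimal reference sketch of what weights decompression computes. It is not the JIT kernel from this PR and does not use any oneDNN API. The 16-entry table is the NF4 code book as published with QLoRA/bitsandbytes. The function names, the low-nibble-first packing order, and the one-scale-per-group layout are assumptions made for illustration only.

```python
# NF4 code book: 16 values, as published with QLoRA / bitsandbytes.
NF4_TABLE = [
    -1.0, -0.6961928009986877, -0.5250730514526367, -0.39491748809814453,
    -0.28444138169288635, -0.18477343022823334, -0.09105003625154495, 0.0,
    0.07958029955625534, 0.16093020141124725, 0.24611230194568634,
    0.33791524171829224, 0.44070982933044434, 0.5626170039176941,
    0.7229568362236023, 1.0,
]


def unpack_nf4(packed, count):
    """Split each byte into two 4-bit indices.

    Assumption: the low nibble holds the first element of each pair.
    """
    indices = []
    for byte in packed:
        indices.append(byte & 0x0F)   # low nibble
        indices.append(byte >> 4)     # high nibble
    return indices[:count]


def decompress_nf4(packed, scales, group_size, count):
    """Look up each 4-bit index in the table and apply its group scale.

    Assumption: one floating-point scale per `group_size` consecutive weights.
    """
    indices = unpack_nf4(packed, count)
    return [NF4_TABLE[i] * scales[k // group_size] for k, i in enumerate(indices)]


if __name__ == "__main__":
    packed = bytes([0xF0, 0x7F])      # indices 0, 15, 15, 7
    print(decompress_nf4(packed, scales=[2.0, 0.5], group_size=2, count=4))
```

An optimized kernel would typically fuse this lookup and scaling into the GEMM rather than materializing the decompressed weights, but the arithmetic per element is the same.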

petercad, May 17 '25 22:05