oneDNN
[POC] xe: jit: gemm: nf4 weights decompression
POC of nf4 weights decompression for Intel GPUs (MFDNN-13636), to allow OpenVINO to test it out.
Adds a new nf4 data type (possibly not the final design; for enablement only) and optimized weights decompression kernels for basic LLM-like sizes on XeHPG/Xe2.
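For readers unfamiliar with nf4, here is a minimal scalar reference sketch of what weights decompression computes. It is not the kernel code in this PR: the helper name `dequantize_nf4`, the packing order (even element in the low nibble) and the per-group scale layout are all assumptions made for illustration. The 16-entry code book is the NF4 table as published with QLoRA/bitsandbytes.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// NF4 code book (values as published with QLoRA / bitsandbytes).
// Each 4-bit code indexes one of 16 floats in [-1, 1].
static constexpr std::array<float, 16> kNf4Table = {
        -1.0f, -0.6961928009986877f, -0.5250730514526367f,
        -0.39491748809814453f, -0.28444138169288635f, -0.18477343022823334f,
        -0.09105003625154495f, 0.0f, 0.07958029955625534f,
        0.16093020141124725f, 0.24611230194568634f, 0.33791524171829224f,
        0.44070982933044434f, 0.5626170039176941f, 0.7229568362236023f, 1.0f};

// Hypothetical reference helper, not part of oneDNN.
// Assumptions: two codes per byte, even-indexed element in the low nibble,
// and one float scale shared by every `group_size` consecutive elements.
std::vector<float> dequantize_nf4(const std::vector<std::uint8_t> &packed,
        const std::vector<float> &scales, std::size_t n,
        std::size_t group_size) {
    assert(group_size > 0);
    assert(packed.size() * 2 >= n);
    assert(scales.size() * group_size >= n);

    std::vector<float> out(n);
    for (std::size_t i = 0; i < n; ++i) {
        const std::uint8_t byte = packed[i / 2];
        // Select the low or high nibble depending on element parity.
        const std::uint8_t code = (i % 2 == 0) ? (byte & 0x0F) : (byte >> 4);
        // Table lookup followed by the group-wise scale.
        out[i] = kNf4Table[code] * scales[i / group_size];
    }
    return out;
}
```

An optimized GPU kernel would presumably fuse this lookup-and-scale step into the GEMM itself rather than materialize the dequantized weights; the sketch only pins down the arithmetic.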