ZackyLake issues

Results: 3 issues of ZackyLake

gpu: conv: jit: add early slm check to skip ngen jitting

# Description Convolution's JIT kernel can fail when SLM usage exceeds the limit: https://github.com/uxlfoundation/oneDNN/blob/05b4b09de057482a0b324cf8938476400418905b/src/gpu/intel/jit/codegen/codegen.cpp#L1752-L1758 The common use of SLM in conv is for the reduction step of k-slicing: https://github.com/uxlfoundation/oneDNN/blob/05b4b09de057482a0b324cf8938476400418905b/src/gpu/intel/conv/jit/ir_builder.cpp#L513-L517 And this...

platform:gpu-intel

[GPU] Use zerocopy in load_weights when it's backed by buffer in RAM

cldnn::data::load_weights reads and uploads the weights in 2MB chunks. It swaps to avoid stalls, which is good. But when the source data is actually fully in memory (no matter if...

category: Core

category: GPU

ExternalIntelPR

Add eager release for readvalue and kvcache

### Details: A proposal to eagerly release the output memory of kvcache-related primitives, to avoid holding unnecessary memory/tensors while idle. Currently the readvalue primitive keeps a reference to "past...

category: GPU

ExternalIntelPR