bopeng1234

Results 7 issues of bopeng1234

Hi, we are running benchmarks on 3B models using the all-in-one script. The models are [phi3-4k](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct), [phi3-128k](https://huggingface.co/microsoft/Phi-3-mini-128k-instruct) and [starcoder2-3b](https://huggingface.co/bigcode/starcoder2-3b). Environment: ThinkBookX2024, Ultra9-185H, Windows11pro-23H2, 32GB LPDDR5x8400MHz, Arc Driver: 31.0.101.5445, ipex-llm[xpu] 2.1.0b20240506. Questions here:...

user issue

On Ubuntu 22.04, i5-1135G7 with the iGPU enabled (oneAPI + compute runtime + Level Zero loader), kernel 6.5.0-35-generic (follow this [link](https://dgpu-docs.intel.com/driver/client/overview.html#install-out-of-tree-driver) to install). Run the `oneapi matrix_mult` sample in one terminal: ``` oneAPI-samples/DirectProgramming/C++SYCL/DenseLinearAlgebra/matrix_mul$ ./matrix_mul_dpc Device:...

Following the current implementation of ScaledDotProductAttentionDecomposition (https://github.com/openvinotoolkit/openvino/blob/master/src/plugins/intel_npu/src/plugin/npuw/llm_compiled_model.cpp#L221), add GroupQueryAttention op decomposition logic to the NPUW llm_compiled_model.

ExternalIntelPR
category: NPU
category: NPUW
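
As a rough illustration of what decomposing a GroupQueryAttention op means, the sketch below expresses grouped-query attention in terms of plain scaled dot-product attention: each key/value head is shared by a group of query heads. It is not the NPUW implementation. The function names (`sdpa`, `gqa`) and the nested-list layout are hypothetical, and masking, past key/value state and rotary embedding are deliberately left out.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def sdpa(q, k, v):
    """Scaled dot-product attention for one head.

    q, k, v are lists shaped [seq_len][head_dim]; no mask is applied.
    """
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for q_row in q:
        scores = [scale * sum(a * b for a, b in zip(q_row, k_row)) for k_row in k]
        weights = softmax(scores)
        out.append([sum(w * v_row[d] for w, v_row in zip(weights, v))
                    for d in range(len(v[0]))])
    return out


def gqa(q_heads, k_heads, v_heads):
    """Grouped-query attention built from per-head SDPA (illustrative only).

    q_heads has num_heads entries; k_heads and v_heads have kv_num_heads
    entries. Query head h reads the KV head with index h // group.
    """
    num_heads, kv_num_heads = len(q_heads), len(k_heads)
    assert num_heads % kv_num_heads == 0, "query heads must split evenly into groups"
    group = num_heads // kv_num_heads
    return [sdpa(q_heads[h], k_heads[h // group], v_heads[h // group])
            for h in range(num_heads)]
```

With four query heads and two key/value heads, heads 0 and 1 read the first KV head and heads 2 and 3 read the second. A graph-level decomposition would typically express the same sharing by broadcasting or repeating the KV tensors before a regular attention subgraph.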

Add a basic implementation of the RotaryEmbedding op (https://github.com/microsoft/onnxruntime/blob/main/docs/ContribOperators.md#com.microsoft.RotaryEmbedding).

category: docs
ExternalIntelPR
category: ONNX FE
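
For readers unfamiliar with the op, here is a minimal sketch of the rotation that a rotary embedding applies to a single head vector, given precomputed cosine and sine caches. It is not the frontend's code. The helper `build_cache` and the base of 10000 are assumptions for illustration only, since the op itself receives the caches as inputs. The `interleaved` flag mirrors the attribute of the same name in the linked operator description: pairs are either adjacent elements or elements half a vector apart.

```python
import math


def build_cache(max_positions, head_size, base=10000.0):
    """Hypothetical helper: cos/sin tables shaped [max_positions][head_size // 2]."""
    half = head_size // 2
    cos_cache, sin_cache = [], []
    for pos in range(max_positions):
        angles = [pos / (base ** (2 * i / head_size)) for i in range(half)]
        cos_cache.append([math.cos(a) for a in angles])
        sin_cache.append([math.sin(a) for a in angles])
    return cos_cache, sin_cache


def rotary_embedding(x, position, cos_cache, sin_cache, interleaved=False):
    """Rotate one head vector x (length head_size) for the given position."""
    half = len(x) // 2
    cos, sin = cos_cache[position], sin_cache[position]
    out = [0.0] * len(x)
    for i in range(half):
        # Pick the two elements that form the i-th rotated pair.
        a, b = (2 * i, 2 * i + 1) if interleaved else (i, i + half)
        out[a] = x[a] * cos[i] - x[b] * sin[i]
        out[b] = x[b] * cos[i] + x[a] * sin[i]
    return out
```

Because each pair undergoes a pure 2D rotation, the vector's length is unchanged, and position 0 leaves the input as it was.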

The current implementation fails when stash_type does not match X's element type, for example when stash_type is fp32 and X is fp16.

ExternalIntelPR
category: ONNX FE
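
The entry does not name the op. Assuming it is an ONNX normalization op with a `stash_type` attribute (LayerNormalization is one such op), the usual way to handle the mismatch is to upcast X to the stash precision, compute the statistics there, and cast the result back to X's type. The sketch below shows that flow on plain Python lists. `to_fp16` and `layer_norm` are hypothetical names, Python floats (doubles) stand in for the fp32 stash type, and scale and bias are omitted.

```python
import math
import struct


def to_fp16(value):
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("e", struct.pack("e", value))[0]


def layer_norm(x_fp16, epsilon=1e-5):
    """Normalize fp16 data while computing statistics in higher precision."""
    # Stage one: upcast to the stash precision (Python floats stand in for fp32).
    x = [float(v) for v in x_fp16]
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    inv_std = 1.0 / math.sqrt(var + epsilon)
    # Stage two: normalize, then cast the result back to X's element type.
    return [to_fp16((v - mean) * inv_std) for v in x]
```

The point of the sketch is only the placement of the two casts: nothing in the middle of the computation assumes that X and the stash type share an element type.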

# Existing Sample Changes ## Description The [UV tool](https://github.com/astral-sh/uv) is designed to streamline the management of Python environments for multiple test cases. One of its standout features is its ability...

Hackathon

Move the ConvertWeightCompressedConv1x1ToMatmul pattern and its test from the intel_gpu plugin to the common transformation folder. The purpose is to reuse it on both the CPU and GPU sides. ###...

category: GPU
category: transformations
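
The reason a 1x1 convolution can be rewritten as a MatMul is that it mixes channels independently at every spatial position. The sketch below checks that equivalence on nested lists. It only illustrates the arithmetic, not the ConvertWeightCompressedConv1x1ToMatmul pattern itself: weight decompression and any layout transposes in the real transformation are not modeled, and all function names are hypothetical.

```python
def conv1x1(x, w):
    """Direct 1x1 convolution. x: [C_in][H][W], w: [C_out][C_in] -> [C_out][H][W]."""
    c_in, height, width = len(x), len(x[0]), len(x[0][0])
    return [[[sum(w[o][c] * x[c][i][j] for c in range(c_in))
              for j in range(width)]
             for i in range(height)]
            for o in range(len(w))]


def matmul(a, b):
    """Plain matrix product of a: [M][K] and b: [K][N]."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def conv1x1_as_matmul(x, w):
    """Same result via MatMul: flatten H*W, multiply by the weights, reshape back."""
    height, width = len(x[0]), len(x[0][0])
    flat = [[v for row in channel for v in row] for channel in x]  # [C_in][H*W]
    out = matmul(w, flat)                                          # [C_out][H*W]
    return [[row[i * width:(i + 1) * width] for i in range(height)] for row in out]
```

For integer inputs the two functions return identical results, which is the property that lets a single shared pattern serve both the CPU and GPU plugins.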