bopeng1234 issues

Results: 7 issues of bopeng1234

Phi-3 model performance on MeteorLake GPU

Hi, we are running benchmarks on 3B models using the all-in-one script. The models are [phi3-4k](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct), [phi3-128k](https://huggingface.co/microsoft/Phi-3-mini-128k-instruct), and [starcoder2-3b](https://huggingface.co/bigcode/starcoder2-3b). Environment: ThinkBookX2024 Ultra9-185H Windows11pro-23H2 32GB LPDDR5x8400MHz Arc Driver: 31.0.101.5445 ipex-llm[xpu] 2.1.0b20240506 Questions here:...

user issue

Tried the sysmon tool, no process detected.

On Ubuntu 22.04 with an i5-1135G7 and the iGPU enabled (oneAPI + compute runtime + Level Zero loader), kernel 6.5.0-35-generic (follow this [link](https://dgpu-docs.intel.com/driver/client/overview.html#install-out-of-tree-driver) to install). Run the `oneapi matrix_mult` sample in one terminal ``` oneAPI-samples/DirectProgramming/C++SYCL/DenseLinearAlgebra/matrix_mul$ ./matrix_mul_dpc Device:...

[NPUW] add GroupQueryAttention OP in npuw llm_compiled_model

Following the current implementation of ScaledDotProductAttentionDecomposition (https://github.com/openvinotoolkit/openvino/blob/master/src/plugins/intel_npu/src/plugin/npuw/llm_compiled_model.cpp#L221), add GroupQueryAttention op decomposition logic to the NPUW llm_compiled_model.

ExternalIntelPR

category: NPU

category: NPUW
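As a rough illustration of what a GroupQueryAttention decomposition has to compute, the sketch below expresses grouped-query attention as ordinary scaled dot-product attention in which several query heads share one key/value head. It is a plain-Python sketch of the math only, not the NPUW code: the function names are hypothetical, and masking, rotary embedding, and the KV cache are left out.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sdpa(q, k, v):
    """Scaled dot-product attention for a single head.

    q: [seq_q][d], k: [seq_k][d], v: [seq_k][d_v] (nested lists of floats).
    """
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for q_row in q:
        scores = [scale * sum(a * b for a, b in zip(q_row, k_row)) for k_row in k]
        weights = softmax(scores)
        out.append([
            sum(w * v_row[c] for w, v_row in zip(weights, v))
            for c in range(len(v[0]))
        ])
    return out


def grouped_query_attention(q_heads, k_heads, v_heads):
    """Attention where each K/V head serves a group of query heads.

    q_heads: [num_heads][seq_q][d]
    k_heads, v_heads: [kv_num_heads][seq_k][d]
    """
    num_heads, kv_num_heads = len(q_heads), len(k_heads)
    if kv_num_heads == 0 or num_heads % kv_num_heads != 0:
        raise ValueError("num_heads must be a multiple of kv_num_heads")
    group = num_heads // kv_num_heads
    # Query head h reads K/V head h // group, which is equivalent to
    # repeating every K/V head `group` times and running plain SDPA.
    return [
        sdpa(q_heads[h], k_heads[h // group], v_heads[h // group])
        for h in range(num_heads)
    ]
```

With 4 query heads and 2 key/value heads, heads 0 and 1 attend over the first K/V head and heads 2 and 3 over the second.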

[ONNX] add com.microsoft.RotaryEmbedding op to onnx frontend

Add a basic implementation of the RotaryEmbedding op; see https://github.com/microsoft/onnxruntime/blob/main/docs/ContribOperators.md#com.microsoft.RotaryEmbedding

category: docs

ExternalIntelPR

category: ONNX FE
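For readers unfamiliar with the operation, here is a minimal sketch of the rotation that a rotary embedding applies to one head vector. It is not the frontend implementation: the function name is hypothetical, the angles are computed on the fly with an assumed base of 10000 rather than read from precomputed cos/sin caches, and only the two common pairing layouts (split halves and interleaved pairs) are shown.

```python
import math


def rotary_embed(x, position, interleaved=False, base=10000.0):
    """Rotate pairs of elements of x by position-dependent angles.

    x: list of floats with even length d (one head vector).
    interleaved=False pairs element i with element i + d/2;
    interleaved=True pairs element 2i with element 2i + 1.
    `base` is an assumed frequency base, not taken from the op spec.
    """
    d = len(x)
    if d % 2 != 0:
        raise ValueError("vector length must be even")
    half = d // 2
    out = [0.0] * d
    for i in range(half):
        theta = position * base ** (-2.0 * i / d)
        c, s = math.cos(theta), math.sin(theta)
        a, b = (2 * i, 2 * i + 1) if interleaved else (i, i + half)
        # Standard 2-D rotation of the pair (x[a], x[b]) by theta.
        out[a] = x[a] * c - x[b] * s
        out[b] = x[b] * c + x[a] * s
    return out
```

Because each pair is only rotated, the vector's length is preserved, and position 0 leaves the input unchanged.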

[ONNX] fix onnx frontend simplified_layer_normalization bug

The current implementation fails when stash_type does not match X's element type, for example when stash_type is fp32 and X is fp16.

ExternalIntelPR

category: ONNX FE
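The mismatch described above can be pictured with a small sketch of the general pattern a stash type implies: convert X to the stash precision, compute the normalization statistics there, then convert back to X's type before scaling. This is an illustration only and not the frontend fix. It assumes the op computes `x / sqrt(mean(x^2) + epsilon) * scale`, the helper names are hypothetical, and fp16/fp32 are emulated by rounding Python floats through `struct`.

```python
import math
import struct


def to_fp16(value):
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


def to_fp32(value):
    """Round a Python float to the nearest IEEE single-precision value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def simplified_layer_norm(x, scale, epsilon=1e-5):
    """RMS-style normalization with fp16 data and an fp32 stash type.

    x, scale: lists of floats that are treated as fp16 tensors.
    """
    n = len(x)
    # Stage one in the stash type: convert X to fp32 before the statistics.
    x32 = [to_fp32(v) for v in x]
    mean_sq = to_fp32(sum(v * v for v in x32) / n)
    inv_rms = to_fp32(1.0 / math.sqrt(mean_sq + epsilon))
    normalized = [to_fp32(v * inv_rms) for v in x32]
    # Stage two in X's type: convert back to fp16, then apply the scale.
    return [to_fp16(to_fp16(v) * to_fp16(s)) for v, s in zip(normalized, scale)]
```

For inputs such as 300.0 and 400.0, the squared values exceed the largest finite fp16 value (65504), which is one reason the statistics are usually kept in a wider type.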

Use the uv tool to isolate the CI/CD env

# Existing Sample Changes ## Description The [UV tool](https://github.com/astral-sh/uv) is designed to streamline the management of Python environments for multiple test cases. One of its standout features is its ability...

Hackathon

[Transformation] move the place of ConvertWeightCompressedConv1x1ToMatmul pattern

Move the ConvertWeightCompressedConv1x1ToMatmul pattern and its test from the intel_gpu plugin to the common transformation folder. The purpose is to reuse it on both the CPU and GPU sides. ###...

category: GPU

category: transformations
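The equivalence that a Conv1x1-to-MatMul conversion relies on can be checked with a small sketch: a 1x1 convolution applies the same `[C_out, C_in]` weight matrix to the channel vector of every pixel, so it equals a matrix multiplication over the flattened spatial positions. The code below is a plain-Python illustration with hypothetical function names. It ignores batch, bias, stride, and weight compression, and says nothing about how the OpenVINO pattern itself is written.

```python
def conv1x1(x, w):
    """Direct 1x1 convolution.

    x: [C_in][H][W] input, w: [C_out][C_in] weights (1x1 kernel squeezed).
    Returns [C_out][H][W].
    """
    c_in, height, width = len(x), len(x[0]), len(x[0][0])
    return [
        [
            [sum(w[o][c] * x[c][i][j] for c in range(c_in)) for j in range(width)]
            for i in range(height)
        ]
        for o in range(len(w))
    ]


def conv1x1_as_matmul(x, w):
    """Same result via reshape -> matmul -> reshape."""
    c_in, height, width = len(x), len(x[0]), len(x[0][0])
    # Flatten spatial positions: one row of C_in channel values per pixel.
    pixels = [
        [x[c][i][j] for c in range(c_in)]
        for i in range(height)
        for j in range(width)
    ]
    # [H*W, C_in] times the transposed weights [C_in, C_out] -> [H*W, C_out].
    product = [[sum(p[c] * row[c] for c in range(c_in)) for row in w] for p in pixels]
    # Restore the [C_out][H][W] layout.
    return [
        [[product[i * width + j][o] for j in range(width)] for i in range(height)]
        for o in range(len(w))
    ]
```

Both functions return the same tensor for any input, so a backend is free to run whichever form it executes faster.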