Jiapeng Zhou issues

Results: 9 issues by Jiapeng Zhou

Database Query Error when the plugin is disabled and then added again while there are no comments

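![image](https://user-images.githubusercontent.com/45599590/82786733-b577d080-9e97-11ea-8af1-152dffa1bbc9.png) The check here for whether the `receiveMail` column exists in the database is done by fetching one row, so when the comments table holds no comment data the column is judged not to exist. If the plugin was enabled before, the `receiveMail` column is already in the database. If the plugin is then disabled and enabled again while there are no comments, it tries to add the column a second time, which causes the error.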
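The failure mode described above can be reproduced in miniature. The sketch below is only an illustration: it uses Python's built-in `sqlite3` as a stand-in for the plugin's real database layer, and the table layout and the helper names `has_column_via_row` and `has_column_via_schema` are assumptions, not code from the plugin.

```python
import sqlite3


def has_column_via_row(conn, table, column):
    """Row-based check (the problematic pattern): needs at least one row."""
    cur = conn.execute(f"SELECT * FROM {table} LIMIT 1")
    row = cur.fetchone()
    if row is None:
        # An empty table is misread as "column does not exist".
        return False
    return column in [d[0] for d in cur.description]


def has_column_via_schema(conn, table, column):
    """Schema-based check: works whether or not the table has rows."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    return any(r[1] == column for r in rows)


# Hypothetical comments table that already has the column but no rows,
# i.e. the state after the plugin was enabled once and then disabled.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comments ("
    "coid INTEGER PRIMARY KEY, text TEXT, receiveMail INTEGER DEFAULT 1)"
)

row_check = has_column_via_row(conn, "comments", "receiveMail")
schema_check = has_column_via_schema(conn, "comments", "receiveMail")

# Acting on the wrong answer re-adds the column and the query fails.
duplicate_error = None
if not row_check:
    try:
        conn.execute("ALTER TABLE comments ADD COLUMN receiveMail INTEGER DEFAULT 1")
    except sqlite3.OperationalError as exc:
        duplicate_error = str(exc)
```

With an empty table the row-based check reports the column as missing while the schema-based check finds it, so only the former goes on to issue the duplicate `ALTER TABLE`. On MySQL the equivalent schema lookup would go through `INFORMATION_SCHEMA.COLUMNS` or `SHOW COLUMNS` rather than `PRAGMA table_info`.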

Forget to clear the `_ever_prefetched` flag in the CacheBlk::invalidate() function

The function `clearPrefetched()` doesn't clear the `_ever_prefetched` flag. It seems that a `clearEverPrefetched()`-like function is needed here. When that is applied, there is a slight change in scores(
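A toy model of the reported behaviour, written in Python rather than the project's own language, may make the point easier to see. Every name below (`CacheBlk`, `fill`, `clear_ever_prefetched`, the `clear_ever` switch) is a hypothetical stand-in chosen for illustration and is not taken from the actual code base.

```python
class CacheBlk:
    """Toy cache block with a current and a sticky prefetch flag."""

    def __init__(self):
        self.valid = False
        self.prefetched = False       # block currently holds prefetched data
        self.ever_prefetched = False  # block was filled by a prefetch at some point

    def fill(self, by_prefetch):
        self.valid = True
        if by_prefetch:
            self.prefetched = True
            self.ever_prefetched = True

    def clear_prefetched(self):
        # Mirrors the reported behaviour: only the current flag is reset.
        self.prefetched = False

    def clear_ever_prefetched(self):
        # The extra helper the issue asks for.
        self.ever_prefetched = False

    def invalidate(self, clear_ever=True):
        self.valid = False
        self.clear_prefetched()
        if clear_ever:
            self.clear_ever_prefetched()


# Without the fix, the sticky flag leaks into the next demand fill.
stale = CacheBlk()
stale.fill(by_prefetch=True)
stale.invalidate(clear_ever=False)
stale.fill(by_prefetch=False)

# With the fix, a reused block starts from a clean state.
fixed = CacheBlk()
fixed.fill(by_prefetch=True)
fixed.invalidate()
fixed.fill(by_prefetch=False)
```

After the sequence above, `stale.ever_prefetched` is still set even though the block's current contents came from a demand fill, whereas `fixed.ever_prefetched` is cleared. Any statistic derived from the sticky flag would differ between the two, which is consistent with the score change the issue mentions.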

CAT/MBA support for Non-CPU agent

Hi, in the released manual "[Intel RDT architecture specification](https://cdrdv2-public.intel.com/789566/356688-intel-rdt-arch-spec.pdf)", I noticed that we can assign an RMID/CLOS tag to a non-CPU agent such as a PCIe/CXL device. But I did not find...

Questions about SHAVE Data Backup Storage and Cache Coherency

In the 2025.18 release, I noticed that the Act Runtime now supports SHAVE execution directly from DDR, rather than copying data from DDR to CMX via DMA. I'm curious if...

help wanted

question

Enable prefetching of SW kernel instructions after the first SW task

## Summary This PR enhances the AddSwKernelInstructionPrefetchPass to enable prefetching of SHAVE kernel instructions *after* the first SHAVE task, if the initial slack is insufficient. Currently, instruction prefetching is skipped...

READY_FOR_REVIEW

Multi-Tenant Behavior and Resource Sharing on Intel NPU4

Hi, I'm profiling workloads on the Intel NPU4 architecture and have some questions regarding multi-tenant usage. The manual mentions 6 tiles with corresponding CMX. My main concern is how different...

[Question] Unexpectedly low prefill (TTFT) latency ratio

Hello, I'm testing the Qwen2.5-1.5B model with openvino.genai and observing PerfMetrics that seem counter-intuitive. The Time to First Token (TTFT) accounts for a very small fraction of the total Generate...

[Question] Understanding NPU Compilation: Model Chunking, IR Re-use, and Dynamic Shapes

I'm analyzing the NPU inference behavior for a Qwen2-0.5B model (24 layers) and have observed a fascinating compilation and runtime pattern. The model is compiled into 6 separate IRs (3...

Performance Instability in LLM Inference on Lunar Lake NPU

I observed an approximate 5% latency variation (using the raw_metrics.m_inference_durations metric) during inference with Large Language Models, such as Qwen2-0.5B, on the Lunar Lake NPU. This instability persists even after...