ppl.llm.serving issues

Compilation error

## What are the problems?(screenshots or detailed error messages) In file included from /home/liuxiandong/workspace/ppl/ppl.llm.serving/src/models/llama/llama_worker.h:25:0, from /home/liuxiandong/workspace/ppl/ppl.llm.serving/src/models/llama/llama_worker.cc:18: /home/liuxiandong/workspace/ppl/ppl.llm.serving/src/models/llama/../../utils/mpsc_request_scheduler.h:21:10: fatal error: ppl/common/event_count.h: No such file or directory #include "ppl/common/event_count.h" ^~~~~~~~~~~~~~~~~~~~~~~~~~ In file...

Liu-xiandong

A question about performance profiling

1

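## What are the problems?(screenshots or detailed error messages) Is there a performance-analysis tool available (something profiler-related), or can we only use something like Nsight profiling to inspect the performance of individual operators ourselves? ## What are the types of GPU/CPU you are using? GPU: A100-80G-SXM4 ## What's the operating system ppl.llm.serving runs on?...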

Zhiy-Zhang

Error for llama-13B on V100

3

An error was encountered while executing client_qps_measure. Platform: llama-13B on 2 V100 GPUs ``` [INFO][2023-09-13 03:35:21.764][llama_server.cc:539] max_tokens: 75630 [INFO][2023-09-13 03:35:21.827][llama_server.cc:484] VOCAB_SIZE: 32000; BOS ID: 1; EOS ID: 2; PAD ID:...

yisongsong

enhancement

Compilation error: [ 17%] Built target crypto as: symbol lookup error: as: undefined symbol: deflate

2

## What are the problems?(screenshots or detailed error messages) ## What are the types of GPU/CPU you are using? ## What's the operating system ppl.llm.serving runs on? ## What's the...

af-74413592

Is qwen1.5 or qwen2 supported?

1

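Exporting the model with ppl.pmx Export produces a large number of warnings of the form "Warning: The shape interface of opmx::XX type is missing" (where XX is, for example, ParallelEmbedding, ColumnParallelLinear, or Reshape). Starting ppl_llm_server with the resulting ONNX file then reports: unsupported op: domain[opmx], type[ParallelEmbedding]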

Flynn-Zh

How to generate a custom dataset?

## What are the problems?(screenshots or detailed error messages) I need to benchmark the time to first token (TTFT) of Llama 2 7B with OpenPPL, and I have to benchmark it with static...

trebladev

update git deps

1

This PR will fix the compile error due to: ``` /home/xxx/workspace/ppl/ppl.llm.serving/src/models/llama/../../utils/mpsc_request_scheduler.h:21:10: fatal error: ppl/common/event_count.h: No such file or directory #include "ppl/common/event_count.h" ^~~~~~~~~~~~~~~~~~~~~~~~~~ compilation terminated. ```

syheliel

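[Security vulnerability] OpenPPL's prefix block cache implementation is subject to cache poisoning

The file in question is: https://github.com/OpenPPL/ppl.llm.serving/blob/master/src/utils/prefix_cache_manager.h Because it uses xxhash64, the hash is not collision-resistant, so an attacker can construct inputs with Hash(a)=Hash(b) and thereby poison the cache. The attacker poisons the cache with a malicious prompt A, and when a user then asks B, the answer to A is returned. The vulnerability is similar in principle to the one in vLLM (CVE-2025-25183) and can be fixed with a similar approach.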

kexinoh
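The cache-poisoning report above can be illustrated with a small, self-contained Python sketch. It is not the code in `prefix_cache_manager.h` and not necessarily the fix adopted by vLLM; the names `PrefixBlockCache`, `sha256_block_key`, and `weak_block_key` are hypothetical. It shows two generic mitigations under those assumptions: keying blocks with a collision-resistant digest (SHA-256 chained over the parent key), and verifying the stored tokens on every hit so that a colliding key is treated as a miss.

```python
import hashlib
import struct
from typing import Callable, Dict, Optional, Sequence, Tuple

# Hypothetical signature: (parent block key, token ids of this block) -> key.
HashFn = Callable[[bytes, Sequence[int]], bytes]


def sha256_block_key(parent_key: bytes, tokens: Sequence[int]) -> bytes:
    """Collision-resistant key: SHA-256 over the parent key and token ids."""
    h = hashlib.sha256()
    h.update(parent_key)
    h.update(struct.pack(f"<{len(tokens)}q", *tokens))
    return h.digest()


def weak_block_key(parent_key: bytes, tokens: Sequence[int]) -> bytes:
    """Deliberately collision-prone stand-in, used only to show the attack."""
    return bytes([(sum(parent_key) + sum(tokens)) % 256])


class PrefixBlockCache:
    """Toy prefix block cache (illustrative only, not the OpenPPL class)."""

    def __init__(self, hash_fn: HashFn = sha256_block_key, verify: bool = True):
        self._hash_fn = hash_fn
        self._verify = verify
        # key -> (parent key, token ids, cached payload)
        self._entries: Dict[bytes, Tuple[bytes, Tuple[int, ...], str]] = {}

    def put(self, parent_key: bytes, tokens: Sequence[int], payload: str) -> bytes:
        key = self._hash_fn(parent_key, tokens)
        # Keep the first writer; a later colliding block never overwrites it.
        self._entries.setdefault(key, (parent_key, tuple(tokens), payload))
        return key

    def get(self, parent_key: bytes, tokens: Sequence[int]) -> Optional[str]:
        entry = self._entries.get(self._hash_fn(parent_key, tokens))
        if entry is None:
            return None
        stored_parent, stored_tokens, payload = entry
        if self._verify and (stored_parent, stored_tokens) != (parent_key, tuple(tokens)):
            return None  # key collided with different content: report a miss
        return payload
```

With the weak hash and verification disabled, a lookup for tokens `[3, 2, 1]` returns the payload cached for `[1, 2, 3]`, which is the poisoning scenario described in the issue. Enabling verification, or switching to the SHA-256 key, turns that lookup into a miss. Verification costs one extra comparison per hit and extra memory for the stored tokens; a cryptographic digest costs more per block than xxhash64.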

What is the full name of PPL.LLM?

## What are the problems?(screenshots or detailed error messages) ## What are the types of GPU/CPU you are using? ## What's the operating system ppl.llm.serving runs on? ## What's the...

SendoRay

Error when executing "column_parallel_linear_kernel" while testing llama2_7b with offline_inference

## What are the problems?(screenshots or detailed error messages) Testing llama2_7b with offline_inference produces the following error: [LLMCUDA][pmx/rms_norm_kernel.cc:84] |-DataFormat: NDARRAY [LLMCUDA][pmx/column_parallel_linear_kernel.cc:29] Entry LlmCudaKernel: [/layers.0/w [LLMCUDA][pmx/column_parallel_linear_kernel.cc:36] Input [input]: [LLMCUDA][pmx/column_parallel_linear_kernel.cc:37] TensorName: [/layers.0/attention [LLMCUDA][pmx/column_parallel_linear_kernel.cc:37] |-Data: 0x1120000000 [LLMCUDA][pmx/column_parallel_linear_kernel.cc:37] |-DimCount: 2...

Nepths
