Yi-sir
I'm building TEI with the Python backend on an Ubuntu 20.04 machine without an NVIDIA device and hit a similar problem:

```
--- stderr
thread 'main' panicked at /home/xyz/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/aws-lc-sys-0.28.0/builder/cc_builder.rs:492:13:
### COMPILER BUG...
```
Nice, I'll just do it manually then. Thanks!
One more question: why does the framework split and load the weights manually (instead of just using from_pretrained)? I noticed that in ByteMLPerf's modeling_llama.py the shapes of the mlp/attn weights take mp_size into account, while the transformers library has none of these operations. transformers also supports multi-GPU inference, and its model-loading path already uses some acceleration libraries to handle the distributed case. What advantage does ByteMLPerf's way of splitting and loading weights have over the transformers approach?
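For reference, here is a minimal sketch of the kind of per-rank slicing I understand the question to be about; it is not ByteMLPerf's actual loader, and the names `shard_column_parallel`, `mp_size`, and `mp_rank` are just illustrative.

```python
# Illustrative sketch only: slice a Linear weight along the output
# dimension so each model-parallel rank keeps out_features // mp_size rows.
# This is why the weight shapes in modeling_llama.py depend on mp_size.
import torch

def shard_column_parallel(weight: torch.Tensor, mp_size: int, mp_rank: int) -> torch.Tensor:
    out_features = weight.shape[0]
    assert out_features % mp_size == 0, "out_features must divide evenly across ranks"
    shard = out_features // mp_size
    return weight[mp_rank * shard:(mp_rank + 1) * shard].contiguous()

# Example: a 4096x4096 q_proj weight split across 2 ranks -> 2048x4096 per rank.
full = torch.randn(4096, 4096)
print(shard_column_parallel(full, mp_size=2, mp_rank=0).shape)  # torch.Size([2048, 4096])
```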
3. Added int8 quantization for resnet50-torch-fp32; the quantization parameters need to be set interactively. Also added support for dual-chip asynchronous perf.
[75eb268](https://github.com/bytedance/ByteMLPerf/pull/119/commits/75eb268da25c4501b5d42cec94d32221666177db) This commit adds a KV cache, but it requires modifying some code in transformers before it can run.
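For context, this is a rough sketch of the standard transformers KV-cache decode loop that such a change builds on; it is not the code from the commit above, and the model id is a placeholder.

```python
# Sketch of incremental decoding with transformers' past_key_values cache
# (not the code from 75eb268, just the general pattern).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"  # placeholder model id
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

input_ids = tok("Hello", return_tensors="pt").input_ids
past_key_values = None
with torch.no_grad():
    for _ in range(8):
        out = model(input_ids=input_ids, past_key_values=past_key_values, use_cache=True)
        past_key_values = out.past_key_values        # cached K/V for every layer
        next_id = out.logits[:, -1:].argmax(dim=-1)  # greedy next-token pick
        input_ids = next_id                          # feed only the new token next step
print(tok.decode(next_id[0]))
```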
Env: lmsysorg/sglang:v0.5.3-cu129
But sglang PD disaggregation works well with the Mooncake backend; I wonder what the difference is.
> It seems that the server side doesn't correctly register the memory. Remove line 59 and have a try?

Sorry, do you mean these lines?

```
if PROTOCOL == "rdma":
    ret_value...
```