Yixiao Chen comments

Results: 7 comments by Yixiao Chen

[Bug] Ascend v0.7.2.post1: intermittent hang when benchmarking the serving API

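Hi all, after upgrading to this PR, my offline multi-card Ascend inference pipeline still hangs. Has anyone else run into hangs in offline inference? I'm using lmdeploy 0.8.0 (commit 13b2b5c74ec1d80ec26ee4b8bbcdaec87f406f6c) and dlinfer 0.1.8 (commit cf7b6e362c7d13f26be42708fb690cb4354b2eef). The offline setup is: start a single pipeline (internlm2.5-7b-chat, tp=2, Ascend 910B; I tried both eager mode and graph mode, and both hang), then feed 200 prompts through it one at a time, running each prompt twice. There is a fairly high chance of a hang somewhere in those 200 prompts, and once HCCL_EXEC_TIMEOUT expires it reports the same ACL stream synchronize error, code 507048.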
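For anyone trying to reproduce this, the loop described above can be sketched roughly as follows. This is only a minimal sketch, not the exact script I ran: the model id `internlm/internlm2_5-7b-chat`, the helper names `build_pipeline` and `run_repro`, and the per-call logging are assumptions added for illustration. The log line printed before each call is there so that the last line in the output shows which prompt the run was on when it stalled.

```python
import time
from typing import Callable, List, Sequence, Tuple


def build_pipeline(model_path: str = "internlm/internlm2_5-7b-chat",
                   eager: bool = True):
    """Create an lmdeploy pipeline with tp=2 on Ascend (assumed settings)."""
    # Imported lazily so run_repro can be tried without lmdeploy installed.
    from lmdeploy import PytorchEngineConfig, pipeline

    config = PytorchEngineConfig(tp=2, device_type="ascend", eager_mode=eager)
    return pipeline(model_path, backend_config=config)


def _log(message: str) -> None:
    # Flush immediately: if the process hangs, buffered output would be lost.
    print(message, flush=True)


def run_repro(pipe: Callable[[str], object],
              prompts: Sequence[str],
              repeats: int = 2,
              log: Callable[[str], None] = _log) -> List[Tuple[int, int, float]]:
    """Send each prompt `repeats` times, one request at a time.

    Returns (prompt_index, attempt, seconds) for every completed call.
    """
    timings = []
    for index, prompt in enumerate(prompts):
        for attempt in range(repeats):
            log(f"start prompt={index} attempt={attempt}")
            started = time.perf_counter()
            pipe(prompt)  # blocks here if the pipeline hangs
            elapsed = time.perf_counter() - started
            timings.append((index, attempt, elapsed))
            log(f"done  prompt={index} attempt={attempt} {elapsed:.2f}s")
    return timings
```

Hypothetical usage: `run_repro(build_pipeline(eager=False), prompts)` with a list of 200 prompts for the graph mode case, and `eager=True` for eager mode.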

[Bug] Ascend v0.7.2.post1: intermittent hang when benchmarking the serving API

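> Try upgrading CANN to 8.1.beta1 or later, including the kernels. Mainly use graph mode. If the problem persists, please give us a complete reproduction and we will take another look.

Understood, thanks. I'm currently on my company's shared CANN 8.0.0 install; I'll find a way to upgrade it in the next few days.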

[Bug] With prefix cache enabled, inference results for the same prompt sometimes differ

Thank you! Sure, [qwen_tests.zip](https://github.com/user-attachments/files/19430270/qwen_tests.zip) This file includes my test scripts and outputs for your prompt (`qwen_angelo.py` for script and `qwen_angelo.txt` for output), and my long prompt (`qwen_harry.py` for script and...