Lyu Han
**Describe the feature** As the title says, this feature requests developing a NetModule that wraps libtorch in the MMDeploy SDK, so that TorchScript model inference can be performed via the SDK API. **Motivation**...
**Describe the feature** Write a script to build mmdeploy on the Windows platform. **Motivation** Make the Windows build easier.
Thanks for your contribution; we appreciate it a lot. The following instructions will make your pull request healthier and more likely to receive feedback. If you do not understand...
Several issues (#506, #729, #727) request that FasterTransformer support Llama and Llama-2. Our project [LMDeploy](https://github.com/InternLM/lmdeploy), developed based on FasterTransformer, supports them and their derived models,...
After #1507, which addresses issue #1494, Python >= 3.10 is recommended.
# Background

We found that most LLM inference engines disable sampling when reporting inference performance. In real applications, however, sampling is almost always required. To provide a benchmark as close to real-world usage as possible, we opened this issue to report LMDeploy's performance **with sampling enabled**.

# Test models

1. llama2-7b
2. llama2-13b
3. internlm-20b
4. llama2-70b

# Test devices

1. A100, model compute precision: BF16 (FP16), W4A16, KV8
2. V100, model compute precision: FP16
3. 4090, model compute precision: W4A16
4. 3090, model compute precision: W4A16

...
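For context, "sampling" here means stochastic token selection (e.g. nucleus / top-p sampling with a temperature) rather than greedy argmax decoding. The following is a minimal illustrative sketch of the per-step logic an engine runs when sampling is enabled; it is not LMDeploy's actual implementation (which runs fused CUDA kernels), and the function name and signature are hypothetical:

```python
import math
import random

def top_p_sample(logits, top_p=0.9, temperature=1.0, rng=None):
    """Illustrative nucleus (top-p) sampling over raw logits.

    Returns the index of the sampled token.
    """
    rng = rng or random.Random()
    # softmax with temperature (shifted by max for numerical stability)
    m = max(logits)
    weights = [math.exp((l - m) / temperature) for l in logits]
    total = sum(weights)
    probs = [w / total for w in weights]
    # token ids sorted by probability, most probable first
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    # keep the smallest prefix whose cumulative mass reaches top_p
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    # renormalize over the kept set and draw one token
    r = rng.random() * mass
    acc = 0.0
    for i in kept:
        acc += probs[i]
        if r <= acc:
            return i
    return kept[-1]
```

The extra sort, cumulative sum, and random draw per generated token are exactly the overhead this benchmark measures relative to the sampling-disabled numbers most engines report.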