oldpan issues

Results: 6 issues of oldpan

How to add Ctrl+C, Ctrl+V, and Ctrl+A functionality?

Thanks for this brilliant project! I would like to be able to copy and paste text in imgui-ws, but I tried and failed. Taking a look at the source code: ```cpp...

enhancement

question

Performance gap between building with the TensorRT API and the onnx-tensorrt parser

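Thank you for sharing; there is a lot of solid material here. I have a couple of small questions: - Regarding the sentence "In addition, we found a large number of Transpose+Reshape structures in the network, which can be merged via the TensorRT API when designing the network": my understanding of the API-based approach is that you removed some redundant reshape operations (an equivalent reshape implemented with fewer TRT layers). However, when the network is built with the parser (so the network contains Transpose->Reshape), TensorRT internally optimizes it into a `Transpose+Reshape` structure, just as your Nsight Systems output shows. Doesn't that already merge the redundant reshape/transpose? Is there any performance difference between this and merging directly through the API? - As I understand it, the essential difference between the API-based and parser-based approaches is that the API lets you avoid ONNX glue and fragmented operators. If you modify the ONNX model with a TRT plugin (merging the fragmented operators into one, e.g. layernorm) and then convert the model with parser + plugin, the performance should be the same as with API + plugin, right? I would be glad to discuss this with you!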

fx2ait low performance

When I run `AITemplate/fx2ait/fx2ait/example/02_vision_model/test_vision_model.py` the performance is strange: ```bash BS: 1, PT Eager time per iter: 0.0008954061126708984ms, PT Eager QPS: 1116.81, FX2AIT time per iter: 0.0008138240051269531ms, FX2AIT Eager QPS: 1228.77,...
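A quick arithmetic check on the quoted log line may help when reading it. The sketch below assumes that QPS is simply batch size divided by time per iteration (the helper name `qps_if_seconds` is hypothetical, not part of fx2ait). Under that assumption the logged QPS values are reproduced only if the "time per iter" figures are in seconds, which suggests the `ms` label in the log may be a unit mislabel rather than a measurement problem.

```python
# Figures copied from the quoted log line; the log labels both times as "ms".
pt_eager_time = 0.0008954061126708984
fx2ait_time = 0.0008138240051269531


def qps_if_seconds(time_per_iter: float, batch_size: int = 1) -> float:
    """QPS implied by a per-iteration time, ASSUMING the time is in seconds."""
    return batch_size / time_per_iter


pt_qps = qps_if_seconds(pt_eager_time)    # compare with the logged PT Eager QPS
ait_qps = qps_if_seconds(fx2ait_time)     # compare with the logged FX2AIT QPS
speedup = pt_eager_time / fx2ait_time     # FX2AIT speedup over PT Eager

print(f"PT Eager QPS: {pt_qps:.2f}")
print(f"FX2AIT QPS:   {ait_qps:.2f}")
print(f"Speedup:      {speedup:.2f}x")
```

If the times really were milliseconds, the implied QPS would be a thousand times higher than what the log reports, so the two columns of the log would contradict each other.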

[Feature] Is persistent batch supported for the decoder in enc-dec models?

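### Motivation We have some multimodal models, such as nougat, that consist of a vision encoder model and an LLM decoder model. The encoder is a conventional CV model, similar to ViT, that extracts the image features as encoder_hidden_feature, which is then passed to the decoder. The decoder takes the initial input_id together with encoder_hidden_feature, and it contains a cross attention part. The encoder can be ignored; the question is mainly about the decoder. Does this part support persistent batch? Compared with a conventional llm-decoder, this decoder takes encoder_hidden_feature as an additional input and performs cross attention on it inside the decoder. Static batching currently works in trt-llm, but to improve performance, I would like to ask whether lmdeploy supports persistent batch for this kind of decoder. ### Related resources _No response_ ### Additional context _No response_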

awaiting response

InternLM2 encounters an error when the batch size exceeds 16

# system info - x64 - 23.10-trtllm-python-py3 docker - trt-llm 0.11.0.dev2024061800 - l40s - tensorrt 10.0.1.6 - use 38g * 2 - running in container (--cap-add=SYS_PTRACE --cap-add=SYS_ADMIN --security-opt seccomp=unconfined --ipc=host...

Investigating

functionality issue

When running deepseek-r1 with Ollama, it intermittently outputs garbage

### What is the issue? When I was running deepseek-r1 using Ollama, I occasionally encountered some garbled output, such as ';0?!18=1C%-DB', but sometimes it worked correctly. What could be the...

bug