oldpan
Thanks for the great work! I'll give it a try.
Sorry about that. I didn't consider the IPython case; I only used it in normal Python programs. Maybe you can fix it, or fork this project and finish it on your own....
Did your code run without errors, and it just can't write the .txt file? I'm not sure whether your path string is correct. Can you tell me more...
Hi. That's because we report some numbers in MB, and 1 MB is 1000**2 B (or 1024**2 B for 1 MiB). You can take a look at [this](https://oldpan.me/archives/pytorch-gpu-memory-usage-track)~
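To make the two conventions concrete, here is a minimal sketch of the conversion (the helper name `bytes_to_mb` is my own, not from the linked post):

```python
def bytes_to_mb(num_bytes, binary=False):
    """Convert a byte count to decimal MB (1000**2 B) or binary MiB (1024**2 B)."""
    divisor = 1024 ** 2 if binary else 1000 ** 2
    return num_bytes / divisor

# The same byte count reads differently under the two conventions:
print(bytes_to_mb(8388608, binary=True))  # 8.0 (MiB)
print(bytes_to_mb(8388608))               # 8.388608 (decimal MB)
```

The ~4.8% gap between the two units is a common source of confusion when comparing reported GPU memory figures.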
Mm, thanks for the reply. That's indeed the case. If time permits, using the TensorRT API directly is the best approach.
I think the reason is that you are using Python 2.7, where 2/4 evaluates to 0 (it is 0.5 in Python 3.6). ... By the way, the code is a mess...
@lvhan028 @lzhangzz Thanks for the reply. In nougat, the features output by the encoder are passed into the decoder together with the initial input_ids. Inside the decoder it works like this: there are two KV caches and two attention modules.
I'm also curious how input_embeds can be passed in directly. I'm not sure what your specific requirement for passing input_embeds directly is, or whether it's the same as mine. That said, InternVL2 can be run with trt-llm by assembling the prompt as pre + img + post. The token ids are fixed before anything is fed to trt-llm; when the decoder engine actually runs, they are passed in together with the image's visual_feature, the input_ids are embedded inside the engine and then concatenated with visual_feature. This is doable.
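The embed-then-splice step described above can be sketched roughly as follows. This is a NumPy toy, not the trt-llm API; all names and sizes here are assumptions for illustration:

```python
import numpy as np

# Hypothetical sizes, for illustration only
vocab_size, hidden = 1000, 64
rng = np.random.default_rng(0)
embed_table = rng.standard_normal((vocab_size, hidden))  # stands in for the decoder's embedding

# Prompt token ids assembled as pre + <img placeholder> + post,
# fixed before anything is handed to the decoder engine
pre_ids, post_ids = np.array([1, 5]), np.array([7, 9])
visual_feature = rng.standard_normal((16, hidden))  # output of the vision encoder

# Inside the decoder engine: embed the ids, then splice the visual
# features in at the placeholder position between pre and post
inputs_embeds = np.concatenate(
    [embed_table[pre_ids], visual_feature, embed_table[post_ids]], axis=0
)
print(inputs_embeds.shape)  # (20, 64): 2 pre + 16 image + 2 post tokens
```

The point is that only token ids and visual_feature cross the engine boundary; the concatenation into a single embedding sequence happens inside the decoder.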
> @Oldpan When I get internvl2-2B running, inference always outputs max_token tokens. Why is that?

My guess is that end_id isn't set correctly.
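Why a wrong end_id produces exactly this symptom can be shown with a toy generation loop (the function names and the fake model are mine, not trt-llm code):

```python
def generate(next_token_fn, end_id, max_tokens):
    """Greedy decode loop: stop early only when the emitted token equals end_id."""
    out = []
    for _ in range(max_tokens):
        tok = next_token_fn(out)
        if tok == end_id:  # wrong end_id -> this never fires -> runs to max_tokens
            break
        out.append(tok)
    return out

def fake_model(out):
    """A stand-in model that emits 3 content tokens, then its real EOS id (2) forever."""
    seq = [11, 12, 13]
    return seq[len(out)] if len(out) < len(seq) else 2

print(generate(fake_model, end_id=2, max_tokens=10))       # [11, 12, 13] — stops at EOS
print(len(generate(fake_model, end_id=0, max_tokens=10)))  # 10 — wrong end_id, runs to the cap
```

So if the generation length always equals max_token, the first thing to check is that the end_id (and pad_id) passed to the runtime match the tokenizer's actual EOS token id.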