No One

Results: 13 comments by No One

> The original model_0.88_deploy.onnx weights produce a four-dimensional output with shape [1, 2, h, w], but the weights on your network drive output only two dimensions, shape [h, w]. Looking at the model structure, besides removing the BN layers, your network appends three operations at the end, reshape, gather, and reshape, which change the output dimensions. Could you share the modified network code, or the parameters and implementation of these three operations? Looking forward to your reply!

Hello! Have you figured out how to modify it? This problem has been troubling me as well.
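For context, here is a minimal PyTorch sketch of one way a reshape → gather → reshape tail could convert [1, 2, h, w] into [h, w], assuming it selects a single probability channel and flattens it; the channel index and exact shapes are guesses, not the author's actual parameters:

```python
import torch

def reshape_gather_reshape(logits: torch.Tensor, channel: int = 1) -> torch.Tensor:
    # logits: [1, 2, h, w] -> [h, w], mirroring the three trailing ONNX ops.
    # `channel` is a guess (e.g. the foreground-probability channel).
    _, c, h, w = logits.shape
    flat = logits.reshape(c, h * w)                        # reshape #1: [2, h*w]
    idx = torch.full((1, h * w), channel, dtype=torch.long)
    picked = flat.gather(0, idx)                           # gather along axis 0
    return picked.reshape(h, w)                            # reshape #2: [h, w]
```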

> In my inference output there are quite a lot of "r"s... I don't know why, whether I use lmdeploy or transformers. I don't have a screenshot for the 8B model, but the 8B model really does produce a pile of "r"s. I even assumed the 26B model would be free of this, but it still happens, as shown in the figure below.

I ran into a similar problem; counterintuitively, the results are actually better with the 1B model.
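This is not the reporters' fix, but one common knob to try against this kind of degenerate repetition when generating with transformers is `repetition_penalty`; a minimal sketch, assuming `model` and `tokenizer` are an already-loaded transformers model and tokenizer:

```python
# Sketch only: penalizing repeated tokens often suppresses runaway "r" loops.
# `model` and `tokenizer` are assumed to be loaded elsewhere.
inputs = tokenizer("Describe the image.", return_tensors="pt").to(model.device)
output_ids = model.generate(
    **inputs,
    max_new_tokens=128,
    repetition_penalty=1.1,  # values > 1.0 discourage re-emitting recent tokens
    do_sample=False,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```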

Thank you for your reply. The old problem has been solved, but a new one has appeared: `huggingface/tokenizers: The current process just got forked, after parallelism has already been used....`
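This warning is usually harmless; the documented way to silence it is to disable tokenizer parallelism before any worker processes fork. A minimal sketch:

```python
import os

# Set this before the tokenizer is used (or before DataLoader workers fork);
# it disables the Rust tokenizer's internal thread pool.
os.environ["TOKENIZERS_PARALLELISM"] = "false"
```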

> try this: `pip install transformers==4.40`

I tried transformers 4.40.0; whether I use quant8 or quant4, the same problem is thrown: `>>---------- Run CUDA_VISIBLE_DEVICES=2 python cli_video_demo.py --quant 8 ------------------------------...`

> I found the V100 couldn't support bf16, and the demo uses fp16. I hope this helps.

Thank you for your reply! I found that fp16 seems to have some problems....
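A quick way to encode that fallback (a sketch; the model is assumed to be loaded elsewhere):

```python
import torch

# V100 (compute capability 7.0) has no native bf16 support; prefer bf16 only
# where the hardware provides it, otherwise fall back to fp16.
dtype = torch.bfloat16 if torch.cuda.is_bf16_supported() else torch.float16
print(f"using dtype: {dtype}")
# model = model.to(dtype)  # `model` assumed to be an already-loaded module
```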

Thank you for your reply! I simply ran a demo of a custom model (actually almost the same as Qwen2.5-VL), and I also compared it against the forward implemented by transformers...

Great! I also found that when testing forward, I shouldn't add `if pixel_values is None and pixel_values_videos is None: # Setting dummy pixel_values to avoid a deepspeed error. dummy_pixel = torch.zeros(14308,...`
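For context, here is a sketch of the dummy-visual-input pattern that snippet comes from; the shapes below are hypothetical placeholders, since the original `torch.zeros(14308,...` is truncated. Under DeepSpeed/ZeRO, ranks that receive text-only batches can hang because the vision tower's parameters go unused, so some trainers feed zero-valued pixels on those ranks:

```python
import torch

def ensure_visual_input(pixel_values, pixel_values_videos,
                        num_patches=4, patch_dim=1176):
    # Hypothetical shapes: the repo's actual tensor size is truncated in the
    # quoted snippet, so these numbers are placeholders.
    # A dummy tensor keeps the vision tower in every rank's graph, avoiding
    # DeepSpeed hangs on text-only batches.
    if pixel_values is None and pixel_values_videos is None:
        pixel_values = torch.zeros(num_patches, patch_dim)
    return pixel_values, pixel_values_videos
```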

Oh! I also found a problem: when training with my own customized model, the loss becomes very large, even though the forward process is actually the same as qwen2_5_mixed_modality_forward, and...

Great! In fact, I hope to add some projectors to the forward process of Qwen2.5-VL to handle tasks in other modalities, so I may need to customize my...

Yes, I am making some new attempts, so I want to add some modules between the vision encoder and the LLM. I will collect the data that is now public, but the...
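A minimal sketch of what such an inserted module could look like, assuming an MLP projector that maps an extra encoder's features into the LLM's embedding space; the class name and dimensions are illustrative, not from the Qwen2.5-VL codebase:

```python
import torch
import torch.nn as nn

class ModalityProjector(nn.Module):
    # Hypothetical projector between an extra-modality encoder and the LLM;
    # the two-layer MLP mirrors the common vision-projector design, but the
    # names and sizes here are illustrative only.
    def __init__(self, encoder_dim: int, llm_dim: int):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(encoder_dim, llm_dim),
            nn.GELU(),
            nn.Linear(llm_dim, llm_dim),
        )

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # features: [num_tokens, encoder_dim] -> [num_tokens, llm_dim],
        # ready to be spliced into the LLM's input embedding sequence.
        return self.proj(features)

# Example: projecting 512-dim encoder tokens into a 3584-dim LLM space.
proj = ModalityProjector(encoder_dim=512, llm_dim=3584)
tokens = proj(torch.randn(16, 512))  # -> torch.Size([16, 3584])
```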