LIRENDA621 issues

Results 9 issues of


                                            LIRENDA621

Questions about Distributed Training

I observed that your repository does not seem to provide scripts for multi-gpu training. May I ask if your current code can only train on a single gpu?

Question about Finetue MotionModel

请问二阶段加载animatediff的motion model进行finetune的时候，训练刚开始时模型去噪的结果是接近于噪音吗？我训练时是这个结果。但是如果不加载mm的预训练权重，由于零卷积的存在，训练刚开始时模型去噪的结果是正确的人像（也就是一阶段训练出的结果） When loading animatediff's motion model for finetune in the second stage, is the model denoising result close to noise at the beginning of training? This is the result...

Why not use rot6D as motion representation, but use SMPL-X parameters to represent it?

rot6D is a motion representation commonly used in co-speech gesture generation tasks. In the code, the author sets convert2rot6D to False. What is the reason?

Configuration files are inconsistent

The configuration file in the folder (./experiments) in the pre-training weights folder is inconsistent with the one in ./talkshow/config。 e.g. ./TalkSHOW/experiments/2022-11-02-smplx_S2G-body-pixel-3d/smplx_S2G.json VS ./TalkSHOW/config/body_pixel.json

Qwen2.5-VL-3B-Instruct在很多评测数据集的指标与官方榜单差距较大

torch版本2.5.1 transformers版本4.49.0.dev0 judge模型gpt-4o-turble generation_config: top_p=0.001, top_k=1, temperature=0.01, repetition_penalty=1.0, **自测结果：** AI2D: 0.7836 **DynaMath (worst case overall): 0.067** HallusionBench: 42.49 MMU(val): 0.48 MMStar: 0.544 **官方结果：** AI2D: 0.814 **DynaMath (worst case overall): 0.132**...

LIRENDA621

Questions about Distributed Training

Question about Finetue MotionModel

Why not use rot6D as motion representation, but use SMPL-X parameters to represent it?

Configuration files are inconsistent

Qwen2.5-VL-3B-Instruct在很多评测数据集的指标与官方榜单差距较大

很多评测数据集的post_check函数报错

Ovis1.5-Llama3-8B在Hallusion Bench上的指标和榜单上的指标差距过大

Why use different dataset for Training Ovis1.5-Gemma2-9B-S3 and Ovis1.5-Llama3-8B-S3

Download Laion-coco lots of failed