LIRENDA621
I observed that your repository does not seem to provide scripts for multi-GPU training. May I ask whether the current code supports only single-GPU training?
When loading AnimateDiff's motion model for finetuning in the second stage, is the model's denoising output close to pure noise at the very start of training? That is what I observe. However, if the pretrained motion-module weights are not loaded, then thanks to the zero convolutions the denoising output at the start of training is already a correct portrait (i.e., the result of stage-1 training).
rot6D is a motion representation commonly used in co-speech gesture generation tasks. In the code, the author sets convert2rot6D to False. What is the reason for this choice?
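For context, the 6D rotation representation (from Zhou et al., "On the Continuity of Rotation Representations in Neural Networks") keeps the first two columns of a 3x3 rotation matrix and recovers the full matrix by Gram-Schmidt orthonormalization. A minimal sketch of the round trip, with function names that are illustrative rather than the repository's actual API:

```python
import numpy as np

def matrix_to_rot6d(R: np.ndarray) -> np.ndarray:
    """Take the first two columns of a 3x3 rotation matrix -> 6 numbers."""
    return R[:, :2].T.reshape(6)

def rot6d_to_matrix(d6: np.ndarray) -> np.ndarray:
    """Recover a rotation matrix from 6D via Gram-Schmidt orthonormalization."""
    a1, a2 = d6[:3], d6[3:]
    b1 = a1 / np.linalg.norm(a1)
    b2 = a2 - np.dot(b1, a2) * b1
    b2 /= np.linalg.norm(b2)
    b3 = np.cross(b1, b2)          # third column is the cross product of the first two
    return np.stack([b1, b2, b3], axis=1)

# Round trip: a rotation about the z-axis survives the conversion exactly.
theta = 0.3
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
assert np.allclose(rot6d_to_matrix(matrix_to_rot6d(R)), R)
```

The appeal of rot6D is that, unlike Euler angles or quaternions, it is a continuous representation for networks to regress; whether to convert is a design choice per codebase.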
The configuration file under ./experiments in the pretrained-weights folder is inconsistent with the one in ./TalkSHOW/config, e.g. ./TalkSHOW/experiments/2022-11-02-smplx_S2G-body-pixel-3d/smplx_S2G.json vs. ./TalkSHOW/config/body_pixel.json.
torch version: 2.5.1; transformers version: 4.49.0.dev0; judge model: gpt-4o-turble; generation_config: top_p=0.001, top_k=1, temperature=0.01, repetition_penalty=1.0. **Self-test results:** AI2D: 0.7836, **DynaMath (worst case overall): 0.067**, HallusionBench: 42.49, MMMU(val): 0.48, MMStar: 0.544. **Official results:** AI2D: 0.814, **DynaMath (worst case overall): 0.132**...
When evaluating with the Qwen2.5-VL-Instruct-3B model, I hit the following warning in the post_check function of many datasets, e.g. MathVision: mathv.py: post_check - 128: signal only works in main thread of the main interpreter. I don't understand why this warning appears, or whether it affects evaluation accuracy.
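For what it's worth, that message is CPython's standard error when `signal.signal()` is called outside the main thread; evaluation code that guards post-processing with an alarm-based timeout and runs it in a worker thread will trigger it, and frameworks typically catch it and log a warning rather than crash. A minimal reproduction, independent of the evaluation code (the helper name is illustrative):

```python
import signal
import threading

def register_handler(out: dict) -> None:
    # signal.signal() is only allowed in the main thread of the main
    # interpreter; calling it from any other thread raises ValueError.
    try:
        signal.signal(signal.SIGALRM, lambda signum, frame: None)
        out["msg"] = "registered"
    except ValueError as exc:
        out["msg"] = str(exc)

out: dict = {}
t = threading.Thread(target=register_handler, args=(out,))
t.start()
t.join()
print(out["msg"])  # e.g. "signal only works in main thread of the main interpreter"
```

Because the handler registration simply fails (rather than installing a broken handler), the usual effect is that the timeout guard is skipped for that worker, not that answers are scored differently.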
1. The OpenCompass leaderboard reports 45, but our local test gives only 41.30. 2. The gap is not caused by the judge model: only 14 'unknown' predictions needed judge-model handling, and those 14 questions are not Yes/No questions in the first place. I checked the official predictions, and the same 14 questions are answered incorrectly there as well.
You used 73 datasets to train Ovis1.5-Gemma2-9B-S3, but only 71 datasets to train Ovis1.5-Llama3-8B-S3. Why is this the case? Is it a typo or is there another reason?
When I download the laion-coco dataset via the URLs, about half of the downloads fail. Is this normal? Also, I would like to know what "download failure" and "resize failure" mean.