Tony Lee comments

Results: 10 comments by Tony Lee

Incorporate Normal Maps into the renderer

@timlod Hi, can you share how to optimize normal maps (along with specular and roughness maps for PBR) in pytorch3d? I want to optimize these material maps by inverse rendering, then...

How to render an image with R, t and the camera K?

> Redner uses normalized device coordinates for the calibration matrix, so you have to transform it accordingly. Also, the image y-axis points downwards in the OpenCV convention but seems to...
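The transform mentioned in the quoted reply can be sketched as follows. This is a minimal illustration, not Redner's confirmed convention: it assumes the target range is [-1, 1] on both axes, and the `flip_y` option is only a guess at how a downward-pointing OpenCV y-axis might be handled. The function names are hypothetical, and the exact range and sign should be checked against the renderer's documentation.

```python
def matmul3(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def pixel_to_ndc_intrinsics(K, width, height, flip_y=False):
    """Convert a pixel-space intrinsic matrix K to one that outputs
    coordinates in [-1, 1] on both axes (an assumed NDC convention).

    Pixel u in [0, width] maps to 2*u/width - 1, and likewise for v.
    With flip_y=True the y-axis is negated so that it points upwards.
    """
    sy = -1.0 if flip_y else 1.0
    N = [
        [2.0 / width, 0.0, -1.0],
        [0.0, sy * 2.0 / height, -sy],
        [0.0, 0.0, 1.0],
    ]
    return matmul3(N, K)


# Example: a 640x480 camera with the principal point at the image centre.
K = [
    [500.0, 0.0, 320.0],
    [0.0, 500.0, 240.0],
    [0.0, 0.0, 1.0],
]
K_ndc = pixel_to_ndc_intrinsics(K, 640, 480)
K_ndc_flipped = pixel_to_ndc_intrinsics(K, 640, 480, flip_y=True)
```

With the principal point at the image centre, the offset terms of the normalized matrix become zero, which is a quick sanity check on the conversion.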

Use DiffSynth-Studio to train i2v model based on wanx1.3 t2v model

> [@lith0613](https://github.com/lith0613) If you wish to train the 1.3B t2v model into an i2v model, we recommend that you take a close look at our backend code and make modifications...

sageattention assert headdim in [64, 96, 128], "headdim should be in [64, 96, 128]."

> [@zhaoyun0071](https://github.com/zhaoyun0071) Thanks for reporting this issue. We will debug and fix it.

Hi, have you solved this bug?

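GPU utilization fluctuates heavily during fine-tuning

> Because you are running multi-GPU with zero_2, 80G is normal. The 20G is probably during data loading and transfer, when the training part of the model is not active and it is only in the load state.

Is there any setting that keeps GPU memory usage at a fixed value? When training other models it normally does not change.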

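Training is extremely slow. What settings would speed it up?

> The current approach is already fairly fast, and for now there are not many ways to speed it up. We will later try fine-tuning with the diffusers and PEFT frameworks, and if we have an updated approach we will post it right away.

Thanks for your reply. Is my current training speed noticeably different from what you get in your own training? Roughly how long does your own training take?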

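Training is extremely slow. What settings would speed it up?

> > Thanks for open-sourcing the code! I am currently training on A100s with gradient_checkpoint turned off, frames set to 33, and all other settings unchanged from the official code. With 6 GPUs training in parallel, a single iteration takes about 24 seconds, yet GPU utilization stays at 100%. Are there any other settings that could improve training speed?
>
> What is the batch size per GPU?

Only 1. Anything larger goes OOM.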

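> > > > Thanks for open-sourcing the code! I am currently training on A100s with gradient_checkpoint turned off, frames set to 33, and all other settings unchanged from the official code. With 6 GPUs training in parallel, a single iteration takes about 24 seconds, yet GPU utilization stays at 100%. Are there any other settings that could improve training speed?
> > >
> > > What is the batch size per GPU?
> >
> > Only 1. Anything larger goes OOM.
>
> Are you sure gradient_accumulation_steps is 1? And what is the resolution?

The resolution is 480*720 and gradient_accumulation_steps is 1. These parameters are all defaults and have not been modified. Roughly how long does training take on your side? I just checked again and it is about 21 seconds per iteration; the 24 seconds was because saving the checkpoint took some time.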
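For reference, the figures reported in this thread (6 GPUs, batch size 1 per GPU, gradient_accumulation_steps of 1, 33 frames per sample, roughly 21 to 24 seconds per iteration) can be turned into a throughput estimate. This is a rough sketch only: the function name is hypothetical, and it assumes one iteration means one optimizer step that consumes `batch_per_gpu * grad_accum` samples on every GPU.

```python
def throughput(num_gpus, batch_per_gpu, grad_accum, frames, sec_per_iter):
    """Return (samples per second, frames per second) for one training setup.

    Assumes sec_per_iter is the wall-clock time of one optimizer step.
    """
    samples_per_iter = num_gpus * batch_per_gpu * grad_accum
    samples_per_sec = samples_per_iter / sec_per_iter
    frames_per_sec = samples_per_iter * frames / sec_per_iter
    return samples_per_sec, frames_per_sec


# Numbers quoted in the thread: 21 s without checkpoint saving, 24 s with it.
fast = throughput(num_gpus=6, batch_per_gpu=1, grad_accum=1,
                  frames=33, sec_per_iter=21)
slow = throughput(num_gpus=6, batch_per_gpu=1, grad_accum=1,
                  frames=33, sec_per_iter=24)
print(f"21 s/iter: {fast[0]:.3f} samples/s, {fast[1]:.2f} frames/s")
print(f"24 s/iter: {slow[0]:.3f} samples/s, {slow[1]:.2f} frames/s")
```

Under these assumptions the setup processes well under one video sample per second, which gives a baseline for comparing against other reported training times.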

Why the use of FP16 instead of BF16 precision?

> Pro will use BF16

How should I modify the configuration in the code to fine-tune with bf16?

Support 14B full training?

> [@zsp1993](https://github.com/zsp1993) Done. Now you can use `--use_gradient_checkpointing_offload` and `--training_strategy "deepspeed_stage_3"` to train 14B T2V model using 8 A100 (8*80G VRAM) GPUs.

Still OOM when training the 14B i2v model.