冯祥卫 comments

Results: 10 comments by 冯祥卫

[Feature] Stable Video Diffusion Training Code

👍

RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 6 but got size 5 for tensor number 1 in the list.

Have you solved it? I'm running into the same problem.

RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 6 but got size 5 for tensor number 1 in the list.

Hi, it's related to the size of the input image.

RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 6 but got size 5 for tensor number 1 in the list.

> Try using an input size that is a multiple of 64.
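A minimal sketch of the "multiple of 64" advice, assuming the mismatch comes from the width or height not dividing evenly through the model's downsampling stages (that cause is an assumption, not confirmed in the thread). The helper names `snap_to_multiple` and `snap_resolution` are hypothetical and not part of any training script.

```python
def snap_to_multiple(size: int, multiple: int = 64) -> int:
    """Round `size` to the nearest multiple of `multiple` (at least one multiple)."""
    snapped = (size + multiple // 2) // multiple * multiple
    return max(multiple, snapped)


def snap_resolution(width: int, height: int, multiple: int = 64) -> tuple[int, int]:
    """Return a (width, height) pair where both sides are multiples of `multiple`."""
    return snap_to_multiple(width, multiple), snap_to_multiple(height, multiple)


if __name__ == "__main__":
    # Resize or crop the input frames to the snapped resolution before training.
    print(snap_resolution(1000, 500))
```

Snapping to the nearest multiple changes the aspect ratio slightly; cropping instead of resizing avoids distortion if that matters for your data.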

Can the SVD fine-tuning script be used on 25-frame video data?

> When I fine-tune the U-Net with DeepSpeed via Accelerator, I still run out of GPU memory even with batch_size=1.

How much GPU memory do you have?

Question about classifier guidance for image in training code

https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/stable_video_diffusion/pipeline_stable_video_diffusion.py: it seems the SVD inference code only applies CFG to the image prompt.

Question about classifier guidance for image in training code

Refer to Section 3.2.3 of "Hierarchical Masked 3D Diffusion Model for Video Outpainting". I think this training code adopts CFG over two conditions, so corresponding changes should be made at the inference stage.
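To illustrate what guidance over two conditions could look like at inference, here is a rough sketch of one common formulation (the InstructPix2Pix-style combination). Whether the cited paper or this training code uses exactly this form is an assumption; `dual_cfg`, "condition A", and "condition B" are hypothetical names, and the predictions are plain lists of floats to keep the example self-contained.

```python
def dual_cfg(eps_uncond, eps_a, eps_ab, scale_a, scale_b):
    """Combine three noise predictions with two guidance scales.

    eps_uncond: prediction with both conditions dropped
    eps_a:      prediction with only condition A kept
    eps_ab:     prediction with conditions A and B kept
    """
    return [
        u + scale_a * (a - u) + scale_b * (ab - a)
        for u, a, ab in zip(eps_uncond, eps_a, eps_ab)
    ]


if __name__ == "__main__":
    uncond, cond_a, cond_ab = [0.0, 1.0], [1.0, 2.0], [2.0, 4.0]
    # With both scales at 1 the result is the fully conditioned prediction.
    print(dual_cfg(uncond, cond_a, cond_ab, 1.0, 1.0))
```

Setting `scale_b` to 0 reduces this to ordinary single-condition CFG on condition A, which is the behaviour the inference pipeline appears to have today.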

After training on 512x512, the video never moves. Why?

You could try training the UNet.

Video captions frequently contain "watermark" and "bilibili"

I ran into the same problem: "watermark" and "bilibili" showed up in captions for perfectly clean videos. It is probably an issue with the training data.