DiffSynth-Studio [wan] [lora] train a longer wanvideo model

Many thanks for the DiffSynth work! As far as I can tell, the base models Wan2.1_1.3B_t2v and Wan2.1_14B_t2v are designed to generate 81-frame videos. When I set num_frames to 161, the generated video was poor: flickering shadows appeared and characters moved strangely. Has anyone trained a longer-video version of the t2v model using DiffSynth's LoRA mode? Is there any mechanism in Wan2.1 that limits it to 81 frames? And if I want to generate longer videos, is full training required, or can this be achieved through LoRA training?
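For context on the two frame counts, here is a minimal sketch of the arithmetic. It assumes the Wan2.1 VAE compresses the time axis by a factor of 4 and keeps the first frame separately, so valid lengths have the form 4n + 1. The helper name `latent_frames` and the stride default are illustrative assumptions, not part of the DiffSynth-Studio API.

```python
def latent_frames(num_frames: int, temporal_stride: int = 4) -> int:
    """Number of latent frames for a clip, assuming a causal video VAE
    that keeps the first frame and compresses the rest by `temporal_stride`.

    The stride of 4 is an assumption about Wan2.1's VAE.
    """
    if num_frames < 1 or (num_frames - 1) % temporal_stride != 0:
        raise ValueError(
            f"num_frames must be of the form {temporal_stride}*n + 1, got {num_frames}"
        )
    return (num_frames - 1) // temporal_stride + 1


if __name__ == "__main__":
    # Compare the default length with the longer setting from the question.
    for n in (81, 161):
        print(f"{n} pixel frames -> {latent_frames(n)} latent frames")
```

Under these assumptions, 161 frames gives roughly twice as many latent frames as 81, so the model is sampled at temporal positions well beyond the default clip length. That may be one reason quality drops, but it is only a guess.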

Mar 30 '25 14:03 AnkerLab

@AnkerLab Thanks for the idea. Done: we have trained this LoRA: https://modelscope.cn/models/DiffSynth-Studio/Wan2.1-1.3b-lora-exvideo-v1

Apr 03 '25 02:04 Artiprocher

Will you also provide LoRA support for the remaining three Wan models? Alternatively, since I am familiar with LoRA training and have done it before: what kind of dataset and settings did you use to train a LoRA that enables longer videos?

Apr 08 '25 23:04 lvang77

Following up on this: does the repo support training ExVideo on Wan?

May 28 '25 12:05 Emanuele97x