[wan] [lora] train a longer wanvideo model
Many thanks for the DiffSynth work! The base models Wan2.1_1.3B_t2v and Wan2.1_14B_t2v appear to be designed to generate 81-frame videos. When I set num_frames to 161, the generated video was noticeably worse: flickering shadows appeared and the characters moved strangely. Has anyone trained a longer-video version of t2v using DiffSynth's LoRA mode? Is there any mechanism in Wan2.1 that limits it to 81 frames? If I want to generate longer videos, is full training required, or can this be achieved through LoRA training?
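For context on the 81 vs. 161 frame counts, here is a rough sketch of how many latent frames the model has to handle in each case. It assumes the Wan2.1 VAE compresses time by a factor of 4 and expects frame counts of the form 4k + 1; that assumption is not stated in this thread, and `latent_frames` is a hypothetical helper, not a DiffSynth API.

```python
def latent_frames(num_frames: int, temporal_stride: int = 4) -> int:
    """Number of latent frames for a given pixel-frame count.

    Assumption: the first frame is encoded on its own and every following
    group of `temporal_stride` frames becomes one latent frame.
    """
    if num_frames < 1 or (num_frames - 1) % temporal_stride != 0:
        raise ValueError("num_frames is assumed to be of the form stride * k + 1")
    return (num_frames - 1) // temporal_stride + 1


for n in (81, 161):
    print(f"num_frames={n} -> latent frames={latent_frames(n)}")
```

Under that assumption, 161 frames roughly doubles the temporal length the model sees compared with 81 frames, which may be one reason quality drops without extra training.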
@AnkerLab Thanks for the suggestion. Done. We have trained this LoRA: https://modelscope.cn/models/DiffSynth-Studio/Wan2.1-1.3b-lora-exvideo-v1
Will you also provide LoRA support for the remaining three Wan models? Alternatively, I have done LoRA training before, so what kind of dataset and settings do you use to train a LoRA that enables longer videos?
Following up on this: does the repo support training ExVideo on Wan?