pengyige123

Results: 2 issues by pengyige123

### Describe the bug Low-noise model weights trained in WAN2.2 are used for further low-noise VSA training, and the loss becomes NaN. Our investigation revealed that the NaN...

### Describe the bug The observed behavior is as follows: with VSA, each step takes 11.91 seconds; with Flash Attention, each step takes 9.39 seconds. Thank you...