Fredy
Increase the batch size to 16 or 32, the epochs to maybe 25 or 30, the seq_len to 512 or 1024, and the d_model to 768 or 1024. And make...
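For reference, here is a minimal sketch collecting those suggested hyperparameters in one place. The key names (`seq_len`, `d_model`, etc.) are just the ones from this discussion, not necessarily the library's exact keyword arguments, so adapt them to the actual `Transfusion` API:

```python
# Hypothetical training config collecting the suggested hyperparameters.
# Key names follow the discussion above, not any specific library signature.
config = {
    "batch_size": 32,   # 16 also worth trying
    "epochs": 30,       # or 25
    "seq_len": 1024,    # or 512
    "d_model": 1024,    # or 768
}

def tokens_per_step(cfg):
    """Rough per-step token count, handy for comparing run sizes."""
    return cfg["batch_size"] * cfg["seq_len"]

print(tokens_per_step(config))  # 32 * 1024 = 32768
```

Bumping both `seq_len` and `d_model` at once roughly quadruples per-step cost, so it's worth checking the token count before committing to a long run.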
Ok, I'm going to do some training tests, and then I'll let you know if it still has multimodality
Size test, small model:
```
⚡ ~ python main.py
training autoencoder loss: 0.38245: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 500/500 [00:31
```
Hey, I'm going to run a test increasing the autoencoder's dimensionality to match the model's; maybe this discrepancy in dimensionality is what's causing the error.
I already ran the test and it still seems to lose that multimodality, even though the autoencoder and Transfusion share the same dimensionality: ``` model = Transfusion( num_text_tokens=256, #...
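A quick way to catch this kind of mismatch early is an explicit sanity check before training starts. This is just a sketch; `latent_dim` and `d_model` are hypothetical names, so check what your actual modules expose:

```python
# Sketch of a sanity check for the autoencoder/transformer dimensionality
# mismatch suspected above. Attribute names are hypothetical.
def check_dims(latent_dim: int, d_model: int) -> None:
    if latent_dim != d_model:
        raise ValueError(
            f"autoencoder latent dim ({latent_dim}) != model d_model ({d_model}); "
            "either make them equal or add a projection layer between them"
        )

check_dims(384, 384)  # passes silently
```

Since the dims now match and multimodality is still lost, the mismatch hypothesis looks ruled out, which points the investigation elsewhere (e.g. training scale or data).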
Hey everyone, I need to know whether the changes from this pull request are already merged into the main branch and whether they're already on PyPI. I have a RAG project...
Ok, I'm going to move the wrappers to a separate file; it was just an update adding wrappers around the VAE and the ImageProcessor to avoid memory leaks...
Heeyyy @sayakpaul I already moved the wrappers to wrappers.py, check it out :b
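For context, the wrapper idea is roughly this kind of pattern: hold a heavy object, release it explicitly, and clean up on exit. This is a generic sketch, not the actual `wrappers.py` code, and `ModuleWrapper` is a hypothetical name:

```python
import gc

class ModuleWrapper:
    """Generic sketch of the leak-avoidance wrapper pattern: hold a heavy
    object (e.g. a VAE or image processor), release it explicitly, and
    clean up on context exit. NOT the actual wrappers.py implementation."""

    def __init__(self, module):
        self.module = module

    def __enter__(self):
        return self.module

    def __exit__(self, exc_type, exc, tb):
        self.release()

    def release(self):
        # Drop the reference and force a collection pass so large buffers
        # are freed promptly instead of lingering until the next GC cycle.
        self.module = None
        gc.collect()

# Usage: the wrapped object is only alive inside the with-block.
with ModuleWrapper(bytearray(10**6)) as buf:
    size = len(buf)
print(size)  # 1000000
```

The point of the context-manager shape is that release happens even if an exception interrupts inference, which is where stray references usually survive.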
Note: I'm including the batch inference implementation from diffusers for your reference. I hope it helps! :D Wan2.2: https://github.com/huggingface/diffusers/blob/262ce19b/src/diffusers/pipelines/wan/pipeline_wan.py (The important sections are lines 191 to 194, lines 384 to...