Liger-Kernel
Request to support the Flux model (T2I diffusion transformer)
🚀 The feature, motivation and pitch
This request is to adapt Liger-Kernel to improve the training speed of Flux, a text-to-image diffusion transformer.
Flux has been the top trending model on Hugging Face for two weeks, but it is very difficult to train because it has 12B parameters. The same methods would also be useful for other ViT and DiT models in multimodal use cases.
https://huggingface.co/black-forest-labs/FLUX.1-dev
https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/transformers/transformer_flux.py
Alternatives
No response
Additional context
No response
#take @ByronHsu I’d like to make an attempt