LlamaGen
Training hyperparameters for T2I model training
Hi there! Thanks for your great work on the T2I model. I'm currently studying its training process and have some questions about Stage 1 training (on the LAION-50M dataset). Could you kindly share the related details? It would be a huge help for my learning and reproduction work.
My questions are as follows:
- How many epochs were used for Stage 1 training on LAION-50M?
- Was multi-node training used for Stage 1?
- If so, how many nodes were used, and what was the total training time?
- Could you also list the other key training hyperparameters for Stage 1 (e.g., batch size per GPU, learning rate, optimizer type, image resolution)?
Thanks again for your time and help!
@leileqiTHU Hi, have you figured out what the .jsonl file in autoregressive/train/extract_codes_t2i.py is? Thanks.