Great work! Any plan to train a smaller version, e.g. around 3B?
Hello,
This is really great work that contributes a lot to the community!
Do you have any plans to train a smaller version of the Large World Model (e.g., 1-3B parameters), perhaps based on a smaller model like Phi-2? It should be much easier to train and use fewer computing resources.
If other researchers have such a plan, please reply and we can work together!
Thanks for your interest. We don't have plans to train a smaller model at the moment.
@StarCycle This is an amazing project, but I'm just going to try to load it in 8-bit (I don't even know if it will work). I have a 4070 Ti, which never loads 7B models in fp16, let alone fp32. If there were a way for the community to pitch in and help you guys train on TinyLlama or Phi-3, it would be awesome. I have no idea how much it would cost; I doubt it's cheap or affordable. If it's either of the two, I'm jumping in.
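For a rough sense of why a 7B model struggles on this card, here is a back-of-the-envelope sketch. It counts model weights only (no activations, KV cache, or framework overhead), so real usage will be higher. The 12 GB VRAM figure for the 4070 Ti and the helper names `weight_memory_gb` and `fits_in_vram` are assumptions made for illustration, not anything taken from the LWM codebase.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


def fits_in_vram(num_params: float, dtype: str, vram_gb: float) -> bool:
    """True if the weights alone are smaller than the available VRAM."""
    return weight_memory_gb(num_params, dtype) < vram_gb


if __name__ == "__main__":
    params_7b = 7e9
    vram_gb = 12.0  # assumed VRAM of a 4070 Ti
    for dtype in ("fp32", "fp16", "int8"):
        needed = weight_memory_gb(params_7b, dtype)
        verdict = "fits" if fits_in_vram(params_7b, dtype, vram_gb) else "does not fit"
        print(f"{dtype}: {needed:.0f} GB of weights, {verdict} in {vram_gb:.0f} GB")
```

Under these assumptions only the 8-bit weights fit, and even then the remaining headroom has to cover everything else the model needs at inference time.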
Hi @befman123, I tried generating video with LWM. It needs quite a lot of GPU memory (I had to use an A100 80G or H100). After 3 minutes, I got a 2-second video, and the quality was bad. By the way, I am not familiar with JAX, though I hear it is quite efficient even on Nvidia GPUs.
I think we could wait for Meta to open-source Chameleon, which is similar to LWM (text and image generation with an LLM + VQGAN encoder/decoder, without video generation). ByteDance has also open-sourced VAR. The smallest version of VAR has only 310M parameters. The best news: they are written in PyTorch.
Perhaps you could start by fine-tuning these PyTorch models? If you like, we can set up a Discord server first and see how many people are interested!