mT0-xxl finetuning
Hello! Thanks a lot for your work! I'm using mT0-xxl for a question answering task, but it doesn't perform with the quality I expected, so I'm trying to fine-tune the model a little. If I understood correctly, I should first get the checkpoint and gin file for the model I want to fine-tune. Could you please share these? Also, is it possible to fine-tune it with torch, or is tf the only way?
Hey, there are some more details on mT0 fine-tuning here: https://github.com/bigscience-workshop/xmtf/issues/12 The config is here: https://github.com/bigscience-workshop/xmtf/issues/6#issuecomment-1366147205
Thanks for the reply! I'll try the mentioned config.
Hey @Muennighoff, it seems that I still can't figure out a couple of things. I would really appreciate it if you could give me a hand here. I need to fine-tune your model mT0-xxl (not the original T5X xxl), so according to the manual https://github.com/google-research/t5x/blob/main/docs/usage/finetune.md I need 3 components (excluding the SeqIO Task, which is clear to me for now) to proceed:
- Checkpoint -- Could you please share the mT0-xxl checkpoint? In the manual, all the checkpoints used are TensorFlow/T5X weights, but on Hugging Face there are only PyTorch weights. So I either need the mT0-xxl checkpoint in TensorFlow format, or I need to fine-tune the model in PyTorch (is that even possible? see the sketch after this list).
- Gin file for the model to fine-tune (mT0-xxl in this case) -- Can I use the default one, i.e. https://github.com/google-research/t5x/blob/main/t5x/examples/t5/mt5/xxl.gin?
- Gin file configuring the fine-tuning process -- I write this one on my own, based on https://github.com/google-research/t5x/blob/main/t5x/configs/runs/finetune.gin with some overrides, right? Please correct me if I'm wrong on any of these points.
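For reference, here is roughly what I have in mind for the PyTorch route: a minimal sketch with Hugging Face Transformers. It uses mt0-small as a stand-in, since mt0-xxl (~13B parameters) would additionally need something like DeepSpeed or FSDP to fit in memory, and the QA field names ("question", "context", "answer") are placeholders for my own dataset:

```python
# Minimal sketch: fine-tuning an mT0 checkpoint in PyTorch via HF Transformers.
from datasets import Dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

# mt0-small for illustration; swap in bigscience/mt0-xxl given enough GPU memory.
model_name = "bigscience/mt0-small"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Toy QA data; replace with a real dataset.
train_data = Dataset.from_dict({
    "question": ["What is the capital of France?"],
    "context": ["France's capital is Paris."],
    "answer": ["Paris"],
})

def preprocess(example):
    # Pack question + context into the encoder input, answer into the labels.
    inputs = tokenizer(
        f"Question: {example['question']} Context: {example['context']}",
        truncation=True,
        max_length=512,
    )
    labels = tokenizer(text_target=example["answer"], truncation=True, max_length=64)
    inputs["labels"] = labels["input_ids"]
    return inputs

train_ds = train_data.map(preprocess, remove_columns=train_data.column_names)

args = Seq2SeqTrainingArguments(
    output_dir="mt0-qa-finetuned",
    per_device_train_batch_size=8,
    learning_rate=1e-4,
    num_train_epochs=3,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```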
There's a t5x ckpt here: https://huggingface.co/bigscience/mt0-t5x I don't remember which size that model is, though; I don't have the other ones anymore, maybe @adarob does
For 2. & 3., yes I think so
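In case it helps, here is a minimal sketch of pulling that t5x checkpoint down from the Hub so it can be fed to finetune.gin (the exact file layout inside the repo is an assumption; check the repo page for the actual structure):

```python
# Sketch: download the t5x checkpoint repo locally with huggingface_hub.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id="bigscience/mt0-t5x")
print("t5x checkpoint files are under:", local_dir)
```

The resulting path (or the appropriate sub-directory of it) is then what you'd point finetune.gin's INITIAL_CHECKPOINT_PATH at when launching t5x.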
This does appear to be XXL
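One way to sanity-check that (a sketch; it inspects the Hugging Face PyTorch export rather than the t5x repo itself, assuming the two share a config): the xxl variant of mT5/mT0 uses d_model=4096 and 24 encoder/decoder layers, which the config alone reveals:

```python
# Sketch: confirm the model size from the config, without downloading the weights.
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("bigscience/mt0-xxl")
print(cfg.d_model, cfg.num_layers)  # expect 4096 and 24 for the xxl size
```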
Thanks a lot, guys!