
mT0-xxl finetuning

Open sh0tcall3r opened this issue 2 years ago • 6 comments

Hello! Thanks a lot for your work! I'm using mT0-xxl for a question-answering task, but it doesn't perform with the quality I expected, so I'm trying to fine-tune the model a bit. If I understood correctly, I first need the checkpoint and the gin file for the model I want to fine-tune. Could you please share these? Also, is it possible to fine-tune it with torch, or is tf the only way?

sh0tcall3r avatar May 15 '23 12:05 sh0tcall3r

Hey there are some more details on mT0 fine-tuning here: https://github.com/bigscience-workshop/xmtf/issues/12 The config is here: https://github.com/bigscience-workshop/xmtf/issues/6#issuecomment-1366147205

Muennighoff avatar May 16 '23 08:05 Muennighoff

Thanks for the reply! I'll try the mentioned config.

sh0tcall3r avatar May 16 '23 09:05 sh0tcall3r

Hey @Muennighoff, it seems I still can't figure out a couple of things. I'd really appreciate it if you could give me a hand here. I need to fine-tune your model mT0-xxl (not the initial T5X xxl), so according to the manual https://github.com/google-research/t5x/blob/main/docs/usage/finetune.md I need 3 components (excluding the SeqIO Task, which is clear for now) to proceed:

  1. Checkpoint -- Could you please share the mT0-xxl checkpoint? In the manual all the checkpoints used are TensorFlow weights, but on Hugging Face there are only PyTorch weights. So I either need the mT0-xxl checkpoint in TensorFlow format, or I need to fine-tune the model in PyTorch (is that even possible?)
  2. Gin file for the model to fine-tune (mT0-xxl in this case) -- Could I use the default one, like https://github.com/google-research/t5x/blob/main/t5x/examples/t5/mt5/xxl.gin?
  3. Gin file configuring the fine-tuning process -- I write this on my own based on https://github.com/google-research/t5x/blob/main/t5x/configs/runs/finetune.gin with some overrides, right? Please correct me if I'm wrong on any of these points.
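On point 1, fine-tuning in PyTorch is indeed possible via the Hugging Face `transformers` library, since the PyTorch weights are on the Hub. Below is a minimal, hedged sketch of a single training step; it uses `bigscience/mt0-small` so it runs on modest hardware (the same code applies to `bigscience/mt0-xxl`, which would additionally need multi-GPU or DeepSpeed), and the QA prompt template here is a hypothetical example, not the exact xP3 prompt:

```python
# Sketch: one fine-tuning step for an mT0 checkpoint in PyTorch with
# Hugging Face transformers. mt0-small stands in for bigscience/mt0-xxl.
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer


def format_qa_example(question: str, context: str) -> str:
    """Build a QA prompt (hypothetical template, not the exact xP3 one)."""
    return f"{context}\n\nQuestion: {question}\nAnswer:"


def main():
    model_name = "bigscience/mt0-small"  # stand-in for bigscience/mt0-xxl
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

    # Toy training pair; a real run would iterate over a QA dataset.
    inputs = tokenizer(
        format_qa_example(
            "What is the capital of France?",
            "France is a country in Western Europe.",
        ),
        return_tensors="pt",
    )
    labels = tokenizer("Paris", return_tensors="pt").input_ids

    model.train()
    loss = model(**inputs, labels=labels).loss  # seq2seq cross-entropy
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    print(f"loss: {loss.item():.3f}")


if __name__ == "__main__":
    main()
```

For a full run you would wrap this in a proper dataloader loop or use `Seq2SeqTrainer`; the key point is just that no TensorFlow/t5x checkpoint is required for the PyTorch route.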

sh0tcall3r avatar May 16 '23 14:05 sh0tcall3r

There's a t5x ckpt here: https://huggingface.co/bigscience/mt0-t5x I don't remember which size that model is, though; I don't have the other ones anymore, maybe @adarob does

For 2. & 3., yes I think so
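For reference, a run config along the lines of point 3 might look like the sketch below. Everything here is a placeholder for illustration: the task name, checkpoint path, feature lengths, and step counts are hypothetical and would need to match your registered SeqIO task and the actual checkpoint step:

```gin
# finetune_mt0_xxl.gin -- hypothetical run config (sketch, untested)
from __gin__ import dynamic_registration

include 't5x/examples/t5/mt5/xxl.gin'    # model gin (point 2)
include 't5x/configs/runs/finetune.gin'  # base fine-tuning run (point 3)

# Placeholder values for illustration only:
MIXTURE_OR_TASK_NAME = "my_qa_task"      # your registered SeqIO task
TASK_FEATURE_LENGTHS = {"inputs": 1024, "targets": 256}
TRAIN_STEPS = 1010000                    # checkpoint step + fine-tuning steps
INITIAL_CHECKPOINT_PATH = "gs://my-bucket/mt0-xxl/checkpoint_1000000"
DROPOUT_RATE = 0.1
```

You would then pass it to the trainer with something like `python -m t5x.train --gin_file=finetune_mt0_xxl.gin --gin.MODEL_DIR=\"'gs://my-bucket/mt0-xxl-ft'\"` (again, the paths are placeholders).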

Muennighoff avatar May 18 '23 09:05 Muennighoff

This does appear to be XXL


adarob avatar May 18 '23 14:05 adarob

Thanks a lot, guys!

sh0tcall3r avatar May 22 '23 09:05 sh0tcall3r