[BUG]: There is only one tokenizer in the third step of Chat. What if the actor and critic are two different models?
Hi @iMountTai. The two models can be different, as long as the actor is the same as the initial model (the one trained in the SFT stage) and the critic is the same as the reward model (the one trained in stage 2). They can use different tokenizers. We are preparing a revised version of stages 2 and 3 and will keep you updated!
So does that mean I just need to change the tokenizer on lines 130 and 140 of train_prompts.py?
https://github.com/hpcaitech/ColossalAI/blob/d20dceb9a3d1bdcb2376201220f49fca7c7c1be9/applications/Chat/examples/train_prompts.py#L130
https://github.com/hpcaitech/ColossalAI/blob/d20dceb9a3d1bdcb2376201220f49fca7c7c1be9/applications/Chat/examples/train_prompts.py#L140C1-L140C1
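For context, here is a minimal sketch of what using two tokenizers implies in general. It is not the code in train_prompts.py and may not match how the Chat trainer handles it. Token IDs produced under the actor's vocabulary mean something else under the critic's vocabulary, so one plausible approach is to decode with the actor's tokenizer and re-encode with the critic's before scoring. `ToyTokenizer`, `retokenize_for_critic`, and both vocabularies are hypothetical stand-ins, not ColossalAI or Hugging Face APIs.

```python
class ToyTokenizer:
    """Hypothetical whitespace tokenizer with its own vocabulary (illustration only)."""

    def __init__(self, vocab):
        # Each tokenizer assigns IDs by position in its own vocabulary list.
        self.token_to_id = {tok: i for i, tok in enumerate(vocab)}
        self.id_to_token = {i: tok for tok, i in self.token_to_id.items()}

    def encode(self, text):
        return [self.token_to_id[tok] for tok in text.split()]

    def decode(self, ids):
        return " ".join(self.id_to_token[i] for i in ids)


# Assumption: actor and critic were trained with different vocabularies.
actor_tokenizer = ToyTokenizer(["hello", "world", "good", "answer"])
critic_tokenizer = ToyTokenizer(["answer", "good", "hello", "world"])


def retokenize_for_critic(actor_ids):
    """Decode with the actor's tokenizer, then re-encode with the critic's."""
    text = actor_tokenizer.decode(actor_ids)
    return critic_tokenizer.encode(text)


actor_ids = actor_tokenizer.encode("good answer")   # IDs in the actor's vocabulary
critic_ids = retokenize_for_critic(actor_ids)       # same text, critic's vocabulary

# Feeding the actor's IDs straight to the critic would silently change the text.
misread = critic_tokenizer.decode(actor_ids)
print(actor_ids, critic_ids, repr(misread))
```

The last line shows why a single shared tokenizer only works when both models use the same vocabulary: the same IDs decode to different text under the critic's tokenizer.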
Still unsolved?
I think so.