lyt719
> Hi, HF's Trainer can't be used for this yet. To implement it, you would need to modify the backward and step calls in the HF Trainer's training loop, replacing them with fused_backward from AdaLomo.

I'd like to ask a follow-up: once AdaLomo is integrated into HF, will it support other large models?
> > Hi, HF's Trainer can't be used for this yet. To implement it, you would need to modify the backward and step calls in the HF Trainer's training loop, replacing them with fused_backward from AdaLomo.
>
> @lyt719 I've raised an issue to ask for the integration here: [huggingface/transformers#29649](https://github.com/huggingface/transformers/issues/29649), but I can't give a timeline for the integration yet.

Thank you very much!
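For anyone who wants to try this before official support lands, below is a minimal sketch of the change described above: subclassing `Trainer` so that AdaLomo's `fused_backward` replaces the usual `loss.backward()` / `optimizer.step()` pair. The `lomo_optim` import path, the `AdaLomo(model, lr=...)` constructor, and the `fused_backward(loss, lr)` signature are assumptions based on the `lomo-optim` package, not something confirmed in this thread.

```python
# A minimal sketch, not a confirmed integration. Assumptions: lomo-optim
# exposes AdaLomo(model, lr=...) with a fused_backward(loss, lr) method,
# and overriding training_step is enough to reroute the update.
import torch
from transformers import Trainer
from lomo_optim import AdaLomo  # assumed import path

class AdaLomoTrainer(Trainer):
    def create_optimizer(self):
        if self.optimizer is None:
            self.optimizer = AdaLomo(self.model, lr=self.args.learning_rate)
            # Trainer's loop still calls optimizer.step()/zero_grad(), but
            # AdaLomo applies its update inside fused_backward, so neutralize them.
            self.optimizer.step = lambda *args, **kwargs: None
            self.optimizer.zero_grad = lambda *args, **kwargs: None
        return self.optimizer

    def training_step(self, model, inputs, num_items_in_batch=None):
        # Gradient accumulation and mixed precision are ignored for brevity.
        model.train()
        inputs = self._prepare_inputs(inputs)
        loss = self.compute_loss(model, inputs)
        # fused_backward runs backward and the parameter update in one pass,
        # so loss.backward() must not be called again on this loss.
        self.optimizer.fused_backward(loss, self._get_learning_rate())
        return loss.detach()
```

`AdaLomoTrainer` can then be used as a drop-in replacement for `Trainer` in a fine-tuning script; the fused approach avoids materializing full gradients, which is the memory saving AdaLomo is built around.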
> Hey @lyt719 - could you share reproducible launch scripts you used for fine-tuning? It would also be great to have a snapshot of the checkpoint dir to confirm that...

> @chunping-xt It might be helpful to refer to the add_special_tokens() function from the following link: https://huggingface.co/docs/transformers/main_classes/tokenizer May I ask where your Korean vocab comes from? Do you add all...
> @lyt719, I want to fine-tune with a language other than Korean. The problem is that when changing the vocab size, the model needs its embeddings resized, and I don't...
> @lyt719 not yet; it's my dream, and I'm waiting for someone to care enough to implement it. The only thing I did was use the existing flan-t5 tokens to represent my...
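The standard flow for extending a tokenizer's vocabulary and then resizing the model to match, which the `add_special_tokens()` docs linked above describe, looks roughly like the sketch below. The checkpoint and the new tokens are placeholders for illustration, not the ones used in this thread.

```python
# A short sketch of extending a tokenizer's vocab and resizing the model's
# embeddings to match; the checkpoint and new tokens are illustrative only.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

# Add regular tokens for the new language (add_special_tokens() works the
# same way for special tokens such as a new pad or separator token).
num_added = tokenizer.add_tokens(["예시", "토큰"])  # placeholder tokens
print(f"added {num_added} tokens; new vocab size: {len(tokenizer)}")

# The embedding matrix must grow to cover the new ids, otherwise the model
# indexes out of range on the added tokens.
model.resize_token_embeddings(len(tokenizer))
```

Note that the newly added embedding rows are randomly initialized, so they only become useful after further pretraining or fine-tuning on text in the new language.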
> > Running the 1.5B and 3B models works fine. Running 7B or 8B directly with llamafactory-cli chat, I can enter the conversation, but when I type something at the user prompt, the assistant does not reply. After fine-tuning, running 7B/8B throws the error above before even entering the conversation.
>
> Hello, I ran into a similar situation; did you manage to solve it? I'm using deepseek_r1_distill_qwen_7B without fine-tuning. I can enter the conversation, but when I type at the user prompt, the assistant does not reply and it crashes with Floating point exception (core dump).

I'm also hitting Floating point exception (core dump); have you managed to fix it?