Oscarjia
`%%pixie_debugger` should be the first line of the cell; even a comment must not come before it. Right code is:

```
%%pixie_debugger
def test():
    pass
test()
```

Wrong...
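For contrast, a minimal sketch of the broken layout (my reconstruction of what the truncated "wrong" example presumably showed, not the original comment's code): once anything, even a comment, occupies the first line, `%%pixie_debugger` is no longer recognized as a cell magic.

```
# debug test()   <-- a leading comment pushes the cell magic off the first line,
#                    so IPython rejects the cell instead of starting the debugger
%%pixie_debugger
def test():
    pass
test()
```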
@RDouglasSharp Do we also set `unk_token`?

```
tokenizer.unk_token = eot
tokenizer.unk_token_id = eot_id
```
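For context, a minimal self-contained sketch of where such assignments usually sit; the checkpoint name and the `eot` / `eot_id` values are assumptions that mirror the snippet above, not code from the discussion.

```
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

# Assumption: reuse the end-of-turn token for the otherwise-unset special tokens.
eot = "<|eot_id|>"
eot_id = tokenizer.convert_tokens_to_ids(eot)

tokenizer.pad_token = eot
tokenizer.pad_token_id = eot_id
tokenizer.unk_token = eot
tokenizer.unk_token_id = eot_id
```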
@mmaaz60 Thanks for sharing your better solution, that is really great! Also, I have starred your project!
@hongyinjie What change are you referring to? Can it fix the problem of Llama 3 not stopping? https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct/discussions/4
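If the linked discussion is the well-known one about Llama 3 generation not stopping, the usual workaround is to pass `<|eot_id|>` as an additional terminator. A minimal sketch assuming the standard Transformers API; this is not necessarily the change asked about above.

```
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

# Stop on either the regular EOS token or the end-of-turn token.
terminators = [tokenizer.eos_token_id, tokenizer.convert_tokens_to_ids("<|eot_id|>")]

messages = [{"role": "user", "content": "hi"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=256, eos_token_id=terminators)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```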
I think llama2 also has the same can't-stop situation. So for llama2, is ` ` the stop word? Tested on LLAMA2 with the prompt: `[INST] hi. Evaluate translation from English #This...
> > I think llama2 also has the same situation. So for llama2, is ` ` the stop word? Tested on LLAMA2 with the prompt: `[INST] hi. Evaluate translation from English #This section...
> Hi, I mean the official stop token is indeed ``, but as you can see, the conversation class has defined the llama2 template with ` ` as the stop string, for...
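To make the distinction above concrete: the tokenizer's official stop token and a conversation template's stop string can differ, and the former is easy to inspect. A minimal sketch (the checkpoint name is an assumption, and the conversation class discussed in the thread is not the object checked here):

```
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")

# The official end-of-sequence token Llama 2 is trained to emit.
print(tokenizer.eos_token)     # '</s>'
print(tokenizer.eos_token_id)  # 2

# Whatever stop string the conversation/template class defines should be compared
# against this token; if they differ, generation can run past the intended end of turn.
```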
@congchan Could you add DeepSpeed ZeRO-3 support to train_with_template? Do you think it should add

```
if trainer.is_deepspeed_enabled:
    trainer.save_model()
```

```
if trainer.is_deepspeed_enabled:
    trainer.save_model()
else:
    safe_save_model_for_hf_trainer(trainer=trainer, output_dir=training_args.output_dir)
```

...
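For reference, a rough sketch of what a `safe_save_model_for_hf_trainer`-style helper typically does, paraphrased from FastChat-style training scripts; the repo's actual implementation may differ. The point of the branch above is that under ZeRO-3 the weights are sharded, so `trainer.save_model()` has to consolidate them, while in the plain case gathering the state dict on CPU is enough.

```
import transformers


def safe_save_model_for_hf_trainer(trainer: transformers.Trainer, output_dir: str):
    """Collect the state dict on CPU and save it through the trainer (sketch)."""
    state_dict = trainer.model.state_dict()
    if trainer.args.should_save:
        cpu_state_dict = {key: value.cpu() for key, value in state_dict.items()}
        del state_dict
        # Uses the Trainer's internal save path with the gathered CPU state dict.
        trainer._save(output_dir, state_dict=cpu_state_dict)
```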
@MrZhengXin I think we can add `add_special_tokens=False  # Do not add special tokens`; it works.

```
def tokenize_conversations(conversations, tokenizer):
    input_ids = tokenizer(
        conversations,
        return_tensors="pt",
        padding="max_length",
        max_length=tokenizer.model_max_length,
        truncation=True,
        add_special_tokens=False,  # Do not add special tokens
    ).input_ids
    ...
```
> @MrZhengXin I think we can add `add_special_tokens=False  # Do not add special tokens`; it works.
>
> ```
> def tokenize_conversations(conversations, tokenizer):
>     input_ids = tokenizer(
>         ...
> ```
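A small check of why `add_special_tokens=False` matters here (the checkpoint name is an assumption): when the conversation string already carries the template's special tokens, the default call prepends a second BOS.

```
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")

text = "<s>[INST] hi [/INST] hello </s>"  # the template already contains its special tokens

default_ids = tokenizer(text).input_ids                             # BOS is added again
no_extra_ids = tokenizer(text, add_special_tokens=False).input_ids  # only the tokens in the text

print(default_ids[:3])   # starts with two BOS ids, e.g. [1, 1, ...]
print(no_extra_ids[:3])  # starts with a single BOS id, e.g. [1, ...]
```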