bos_token and eos_token for Llama tokenizer

Open yujianll opened this issue 2 years ago • 3 comments

System Info

  • transformers version: 4.28.0.dev0
  • Platform: Linux-5.15.0-58-generic-x86_64-with-glibc2.17
  • Python version: 3.8.15
  • Huggingface_hub version: 0.11.0
  • PyTorch version (GPU?): 1.13.0+cu117 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed

Who can help?

@ArthurZucker @zphan

Information

  • [ ] The official example scripts
  • [ ] My own modified scripts

Tasks

  • [ ] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • [ ] My own task or dataset (give details below)

Reproduction

from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained("./llama-7b-hf")
tokenizer = AutoTokenizer.from_pretrained("./llama-7b-hf", use_fast=False)

model.config.eos_token_id shows 1, but tokenizer.eos_token_id shows 2.

Expected behavior

I wonder if they should be the same, or am I missing something?

yujianll avatar Mar 18 '23 02:03 yujianll

Hey! There must be a typo in your generation_config, since convert_llama_weights_to_hf.py and configuration_llama both set it to 2. Are you sure that you are using the latest scripts? The fix in this case is just model.config.eos_token_id = 2.
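For readers who prefer not to hard-code the value, here is a minimal sketch of the same in-memory fix. It assumes the tokenizer is the source of truth for special-token IDs. The helper name `align_special_token_ids` is hypothetical, not part of transformers, and the `SimpleNamespace` objects only stand in for `model.config` and the tokenizer, using the values reported in this issue.

```python
from types import SimpleNamespace


def align_special_token_ids(config, tokenizer, names=("bos_token_id", "eos_token_id")):
    """Copy special-token IDs from a tokenizer-like object onto a config-like object.

    Hypothetical helper: returns {name: (old, new)} for every field it changed.
    Fields the tokenizer does not define (or defines as None) are left alone.
    """
    changed = {}
    for name in names:
        new = getattr(tokenizer, name, None)
        old = getattr(config, name, None)
        if new is not None and old != new:
            setattr(config, name, new)
            changed[name] = (old, new)
    return changed


# Stand-ins with the values from this issue: config says 1, tokenizer says 2.
config = SimpleNamespace(eos_token_id=1)
tokenizer = SimpleNamespace(eos_token_id=2)

changed = align_special_token_ids(config, tokenizer)
print(changed)  # {'eos_token_id': (1, 2)}
```

With real objects the call would presumably be `align_special_token_ids(model.config, tokenizer)`. This only changes the loaded model in memory, so the files on disk keep the old value.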

ArthurZucker avatar Mar 20 '23 13:03 ArthurZucker

I see. The config.json and generation_config.json both set it to 1. So I will change it to 2 for now.
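A rough sketch of that on-disk change, using only the standard library. The function name `set_eos_token_id` is hypothetical, and the sketch assumes both files are plain JSON objects with a top-level `eos_token_id` key, as described above.

```python
import json
from pathlib import Path


def set_eos_token_id(model_dir, eos_token_id,
                     filenames=("config.json", "generation_config.json")):
    """Set eos_token_id in each listed JSON file that exists under model_dir.

    Hypothetical helper: returns the names of the files it actually rewrote.
    Missing files and files that already hold the right value are skipped.
    """
    changed = []
    for name in filenames:
        path = Path(model_dir) / name
        if not path.is_file():
            continue
        data = json.loads(path.read_text(encoding="utf-8"))
        if data.get("eos_token_id") != eos_token_id:
            data["eos_token_id"] = eos_token_id
            path.write_text(json.dumps(data, indent=2) + "\n", encoding="utf-8")
            changed.append(name)
    return changed


if __name__ == "__main__":
    # Path taken from the reproduction above; 2 is the value the tokenizer reports.
    print(set_eos_token_id("./llama-7b-hf", 2))
```

Re-running the conversion with the latest convert_llama_weights_to_hf.py, as suggested above, should make this manual patch unnecessary.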

yujianll avatar Mar 20 '23 15:03 yujianll

Thank you 

Jagjeffery avatar Mar 20 '23 15:03 Jagjeffery

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] avatar Apr 17 '23 15:04 github-actions[bot]