Missing tokenizer when using vllm
File "/home/paas/vllm/vllm/engine/llm_engine.py", line 222, in _init_tokenizer
self.tokenizer: BaseTokenizerGroup = get_tokenizer_group(
File "/home/paas/vllm/vllm/transformers_utils/tokenizer_group/__init__.py", line 20, in get_tokenizer_group
return TokenizerGroup(**init_kwargs)
File "/home/paas/vllm/vllm/transformers_utils/tokenizer_group/tokenizer_group.py", line 23, in __init__
self.tokenizer = get_tokenizer(self.tokenizer_id, **tokenizer_config)
File "/home/paas/vllm/vllm/transformers_utils/tokenizer.py", line 66, in get_tokenizer
tokenizer = AutoTokenizer.from_pretrained(
File "/home/paas/miniconda3/envs/naie/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py", line 822, in from_pretrained
return tokenizer_class.from_pretrained(
File "/home/paas/miniconda3/envs/naie/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2086, in from_pretrained
return cls._from_pretrained(
File "/home/paas/miniconda3/envs/naie/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2327, in _from_pretrained
raise OSError(
OSError: Unable to load vocabulary from file. Please check that the provided vocabulary is accessible and not corrupted.
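The failing call can be reproduced outside vllm to check whether the tokenizer files themselves are at fault. A minimal sketch, assuming the model lives at /home/paas/models/dbrx-instruct as in my setup:

# Call AutoTokenizer directly, the same way vllm's get_tokenizer() does.
# If this raises the same OSError, the tokenizer files are the problem,
# not vllm.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "/home/paas/models/dbrx-instruct",  # local model directory
    trust_remote_code=True,             # DBRX uses custom tokenizer code
)
print(tokenizer("hello world").input_ids)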
Thanks @matrixssy, could you provide more details about how you are calling vllm? cc: @megha95
@matrixssy What version of vllm are you using? DBRX uses tiktoken as the tokenizer. If you installed v0.4.0+ (or the latest main branch), this tokenizer would have also been installed. See here.
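A quick way to confirm the dependency is present in the active environment (a minimal check, nothing vllm-specific):

# Report the installed tiktoken version, or flag that it is missing.
from importlib.metadata import PackageNotFoundError, version

try:
    print("tiktoken", version("tiktoken"))
except PackageNotFoundError:
    print("tiktoken is not installed; try: pip install tiktoken")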
Hi, actually I pulled the main branch and installed it with pip install -e . in the repo root. I have also checked that it includes the newly merged DBRX changes.
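A quick way to double-check which vllm the environment actually imports (a small sanity check; with an editable install, __file__ should point into the checkout rather than site-packages):

# Print the version and import path of the vllm that Python resolves.
import vllm

print(vllm.__version__)  # recent main should report 0.4.0 or later
print(vllm.__file__)     # should point into /home/paas/vllm for pip install -e .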
I'm calling it like this:
python -m utils.vllm_server \
--model /home/paas/models/dbrx-instruct \
--tensor-parallel-size 8 \
--max-model-len 8192 \
--block-size 32 \
--disable-log-stats \
--disable-log-requests \
--trust-remote-code \
--dtype float16 \
--gpu-memory-utilization 0.9
In addition, I obtained the tokenizer from https://huggingface.co/Xenova/dbrx-instruct-tokenizer/tree/main, but I'm unsure if this is the correct or intended source.
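One way to sanity-check that repo before relying on it: load it directly, and if it works, point vllm at it explicitly instead of the model directory. A sketch, assuming the Xenova repo above (a community mirror, not an official Databricks artifact) and the tokenizer argument that vllm's LLM / engine args expose:

# 1) Verify the community tokenizer loads at all.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("Xenova/dbrx-instruct-tokenizer")
print(tok("hello world").input_ids)

# 2) If it does, pass it to vllm explicitly instead of the model directory.
from vllm import LLM

llm = LLM(
    model="/home/paas/models/dbrx-instruct",
    tokenizer="Xenova/dbrx-instruct-tokenizer",  # explicit tokenizer source
    trust_remote_code=True,
    tensor_parallel_size=8,
    max_model_len=8192,
    dtype="float16",
)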