Missing tokenizer when using vllm
File "/home/paas/vllm/vllm/engine/llm_engine.py", line 222, in _init_tokenizer
self.tokenizer: BaseTokenizerGroup = get_tokenizer_group(
File "/home/paas/vllm/vllm/transformers_utils/tokenizer_group/__init__.py", line 20, in get_tokenizer_group
return TokenizerGroup(**init_kwargs)
File "/home/paas/vllm/vllm/transformers_utils/tokenizer_group/tokenizer_group.py", line 23, in __init__
self.tokenizer = get_tokenizer(self.tokenizer_id, **tokenizer_config)
File "/home/paas/vllm/vllm/transformers_utils/tokenizer.py", line 66, in get_tokenizer
tokenizer = AutoTokenizer.from_pretrained(
File "/home/paas/miniconda3/envs/naie/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py", line 822, in from_pretrained
return tokenizer_class.from_pretrained(
File "/home/paas/miniconda3/envs/naie/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2086, in from_pretrained
return cls._from_pretrained(
File "/home/paas/miniconda3/envs/naie/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2327, in _from_pretrained
raise OSError(
OSError: Unable to load vocabulary from file. Please check that the provided vocabulary is accessible and not corrupted.
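The failing call can be reproduced outside vllm to check whether the tokenizer files themselves are at fault. A minimal sketch, assuming the model lives at /home/paas/models/dbrx-instruct as in my setup:

# Call AutoTokenizer directly, the same way vllm's get_tokenizer() does.
# If this raises the same OSError, the tokenizer files are the problem,
# not vllm.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "/home/paas/models/dbrx-instruct",  # local model directory
    trust_remote_code=True,             # DBRX uses custom tokenizer code
)
print(tokenizer("hello world").input_ids)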
Thanks @matrixssy, could you provide more details about how you are calling vllm? cc: @megha95
@matrixssy What version of vllm are you using? DBRX uses tiktoken as the tokenizer. If you installed v0.4.0+ (or the latest main branch), this tokenizer would have also been installed. See here.
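A quick way to confirm the dependency is present in the active environment (a minimal check, nothing vllm-specific):

# Report the installed tiktoken version, or flag that it is missing.
from importlib.metadata import PackageNotFoundError, version

try:
    print("tiktoken", version("tiktoken"))
except PackageNotFoundError:
    print("tiktoken is not installed; try: pip install tiktoken")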
Hi, actually I pulled the main branch and installed it with pip install -e . in the repo root. I have also checked that it includes the newly merged DBRX changes.
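A quick way to double-check which vllm the environment actually imports (a small sanity check; with an editable install, __file__ should point into the checkout rather than site-packages):

# Print the version and import path of the vllm that Python resolves.
import vllm

print(vllm.__version__)  # recent main should report 0.4.0 or later
print(vllm.__file__)     # should point into /home/paas/vllm for pip install -e .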
I'm calling it like this:
python -m utils.vllm_server \
--model /home/paas/models/dbrx-instruct \
--tensor-parallel-size 8 \
--max-model-len 8192 \
--block-size 32 \
--disable-log-stats \
--disable-log-requests \
--trust-remote-code \
--dtype float16 \
--gpu-memory-utilization 0.9
In addition, I obtained the tokenizer from https://huggingface.co/Xenova/dbrx-instruct-tokenizer/tree/main, but I'm unsure if this is the correct or intended source.
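One way to sanity-check that repo before relying on it: load it directly, and if it works, point vllm at it explicitly instead of the model directory. A sketch, assuming the Xenova repo above (a community mirror, not an official Databricks artifact) and the tokenizer argument that vllm's LLM / engine args expose:

# 1) Verify the community tokenizer loads at all.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("Xenova/dbrx-instruct-tokenizer")
print(tok("hello world").input_ids)

# 2) If it does, pass it to vllm explicitly instead of the model directory.
from vllm import LLM

llm = LLM(
    model="/home/paas/models/dbrx-instruct",
    tokenizer="Xenova/dbrx-instruct-tokenizer",  # explicit tokenizer source
    trust_remote_code=True,
    tensor_parallel_size=8,
    max_model_len=8192,
    dtype="float16",
)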