support loading model without config.json file

Open • itazap opened this issue 1 year ago • 1 comment

We already support loading a fast tokenizer from a tokenizer.model file alone. However, loading still requires config.json to exist in the model folder (on the Hub or locally), even though config.json is not needed to build the tokenizer from the model file. This change removes that dependency on config.json, so users can load a tokenizer from a folder that contains only a tokenizer.model file.
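As a rough illustration of the intended usage, here is a minimal sketch, assuming a local folder that holds only a tokenizer.model file. The helper names describe_model_dir and load_tokenizer_from_model_file are hypothetical and not part of transformers. LlamaTokenizerFast.from_pretrained is an existing transformers call, but whether it is the exact code path this change touches is an assumption, and converting a tokenizer.model file may require extra packages such as sentencepiece and protobuf.

```python
from pathlib import Path
import tempfile


def describe_model_dir(model_dir):
    """Report which of the two files discussed above exist in model_dir.

    Hypothetical helper for illustration only; not part of transformers.
    """
    model_dir = Path(model_dir)
    return {
        "tokenizer.model": (model_dir / "tokenizer.model").is_file(),
        "config.json": (model_dir / "config.json").is_file(),
    }


def load_tokenizer_from_model_file(model_dir):
    """Load a fast Llama tokenizer from a folder without config.json.

    Assumption: the folder contains a valid Llama tokenizer.model file and
    the installed transformers version includes this change.
    """
    # Imported lazily so describe_model_dir works without transformers installed.
    from transformers import LlamaTokenizerFast

    return LlamaTokenizerFast.from_pretrained(model_dir)
```

Checking the folder first with describe_model_dir makes it explicit that config.json is absent, which is the case this change is meant to support. The loading call itself is the same one a user would make for a complete model folder.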

TESTS:

Updated stale tests
Updated an existing test to work without a config file
Added a test in llama that loads with only a tokenizer.model file

Reviewer: @ArthurZucker

Jul 31 '24 14:07 itazap

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Jul 31 '24 14:07 HuggingFaceDocBuilderDev