detect if tokenizer is not loaded and adjust
In some cases a pretrained model pipeline may not specify the tokenizer in its stored config. If the tokenizer is missing, attempt to load it by model name. An example of this at the time of writing is nvidia/Minitron-4B-Base.
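The fallback described above can be sketched roughly as follows. This is an illustrative helper, not garak's actual implementation; `ensure_tokenizer` is a hypothetical name, and the `transformers` import is deferred so the check itself has no heavyweight dependency.

```python
def ensure_tokenizer(pipe, model_name: str):
    """If the pipeline's stored config omitted a tokenizer, fall back to
    loading one by the model name (hypothetical helper, assumed API)."""
    if getattr(pipe, "tokenizer", None) is None:
        # Lazy import: only needed when the fallback path is taken.
        from transformers import AutoTokenizer

        pipe.tokenizer = AutoTokenizer.from_pretrained(model_name)
    return pipe
```

For a model such as nvidia/Minitron-4B-Base, whose pipeline config lacks a saved tokenizer, the fallback branch would fetch the tokenizer from the hub by name; pipelines that already carry a tokenizer pass through unchanged.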
Verification
- [ ] Test against a model whose pipeline is known to not include a saved tokenizer:
  `python -m garak -m huggingface -n nvidia/Minitron-4B-Base --probes dan.Dan_11_0`
- [ ] Run the tests and ensure they pass:
  `python -m pytest tests/`
- [ ] Verify the generator returned inference responses