garak icon indicating copy to clipboard operation
garak copied to clipboard

detect if tokenizer is not loaded and adjust

Open jmartin-tech opened this issue 1 year ago • 0 comments

In some cases a pretrained model pipeline may not specify the tokenizer in the stored config. If missing attempt to get tokenizer by model name. An example of this as time of revision is nvidia/Minitron-4B-Base.

Verification

List the steps needed to make sure this thing works

  • [ ] Tests against a model where pipeline is known to not have a saved tokenizer:
python -m garak -m huggingface -n nvidia/Minitron-4B-Base --probes dan.Dan_11_0
  • [ ] Run the tests and ensure they pass python -m pytest tests/
  • [ ] Verify the generator returned inference responses

jmartin-tech avatar Nov 18 '24 23:11 jmartin-tech