[OneShot][Testing] Expand Integration tests to run for llama-7b; add gpu/auto cases
Summary
- Update obcq tests into separate integration tests
- Add/Update: `test_sparsities` --> `test_obcq_sparsity.py`; now tests Llama-7b using gpu and "auto", using the same sparsity recipe as TinyStories
- Add/Update: the original `test_obcq_tinystories` --> `test_obcq_completion.py`; tests quantization and quantization + sparsity for TinyStories, and quantization and quantization + pruning with Llama-7b on gpu
- All GPU tests are set up to run on a nightly cadence
- All other integration tests are separated into their own files; configs still need to be updated/added for the different cases we'd like to cover
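One way the nightly-only GPU gating could be wired up, assuming a pytest-based suite (the `--nightly` flag, the `gpu` marker name, and the hook bodies below are illustrative sketches, not the PR's actual configuration):

```python
# conftest.py (hypothetical sketch): skip GPU-marked tests unless the
# nightly run passes --nightly on the command line.
import pytest


def pytest_addoption(parser):
    parser.addoption(
        "--nightly",
        action="store_true",
        default=False,
        help="run GPU integration tests on the nightly cadence",
    )


def pytest_collection_modifyitems(config, items):
    if config.getoption("--nightly"):
        return  # nightly run: keep every test, including GPU ones
    skip_gpu = pytest.mark.skip(reason="GPU tests only run nightly (--nightly)")
    for item in items:
        if "gpu" in item.keywords:
            item.add_marker(skip_gpu)
```

A test file would then tag its heavy cases with `@pytest.mark.gpu`, and the nightly CI job simply adds `--nightly` to the pytest invocation.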
Remaining Questions
- For the new `test_obcq_completion.py` (previously `test_obcq_tinystories`) we are just running one-shot without any metric calculation on the results; is there anything we can add? Perplexity seems to be very long-running, even for TinyStories
- How should the remaining integration tests, which were not updated to test Llama, be updated in terms of use cases?
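On the metric question: one option that avoids a full perplexity sweep over the dataset is computing perplexity on just a handful of short sequences, directly from the logits of a single forward pass. A minimal sketch, assuming PyTorch tensors; the function name and tensor shapes are illustrative, not sparseml API:

```python
# Hypothetical helper: perplexity from one batch of causal-LM logits.
import torch
import torch.nn.functional as F


def batch_perplexity(logits: torch.Tensor, labels: torch.Tensor) -> float:
    """Perplexity = exp(mean token-level cross-entropy).

    logits: (batch, seq_len, vocab) raw model outputs
    labels: (batch, seq_len) target token ids
    """
    # Shift so each position predicts the *next* token, as causal LMs do.
    shift_logits = logits[:, :-1, :].contiguous()
    shift_labels = labels[:, 1:].contiguous()
    loss = F.cross_entropy(
        shift_logits.view(-1, shift_logits.size(-1)),
        shift_labels.view(-1),
    )
    return float(torch.exp(loss))
```

Run on a few fixed prompts, this gives a coarse sanity bound (e.g. "quantized perplexity within X% of dense") in seconds rather than a full-corpus evaluation, though it is noisier than true dataset perplexity.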