lighteval
[FT] Caching
See Slack.
Cache each task once it has finished, in order to 1) avoid re-running it, 2) restart evals that failed without redoing completed tasks, and 3) avoid re-calling the judge model when re-running.
Hi~ I really need this feature! One question: when I use the same tasks to evaluate several models (same architecture, but from different runs), will the evaluation samples (docs and requests) be cached?
Good to know there's interest! We have not decided yet, but we'll likely add a check on the model commit to make sure we only reuse cached samples for the model they were produced with.
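To make the idea concrete, here is a minimal sketch of what a per-task cache keyed on the model commit could look like. Nothing here is lighteval's actual API or a decided design: `TaskCache`, `cache_key`, `run_with_cache`, and the one-JSON-file-per-entry layout are all hypothetical names and choices made up for illustration.

```python
import hashlib
import json
from pathlib import Path


def cache_key(model_name: str, model_commit: str, task_name: str) -> str:
    """Hash the model identity and the task name into a stable file name."""
    raw = json.dumps([model_name, model_commit, task_name])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


class TaskCache:
    """Hypothetical on-disk cache: one JSON file per (model, commit, task)."""

    def __init__(self, cache_dir):
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)

    def _path(self, model_name, model_commit, task_name):
        key = cache_key(model_name, model_commit, task_name)
        return self.cache_dir / f"{key}.json"

    def load(self, model_name, model_commit, task_name):
        """Return the cached results, or None if this task was never finished."""
        path = self._path(model_name, model_commit, task_name)
        if not path.exists():
            return None
        return json.loads(path.read_text(encoding="utf-8"))

    def save(self, model_name, model_commit, task_name, results):
        """Store the results of a finished task."""
        path = self._path(model_name, model_commit, task_name)
        path.write_text(json.dumps(results), encoding="utf-8")


def run_with_cache(cache, model_name, model_commit, task_name, run_task):
    """Run `run_task` only if no result is cached for this exact model commit."""
    cached = cache.load(model_name, model_commit, task_name)
    if cached is not None:
        return cached
    results = run_task()  # assumed to return JSON-serializable results
    cache.save(model_name, model_commit, task_name, results)
    return results
```

Because the commit is part of the key, two checkpoints with the same architecture but from different runs would never share cached outputs, while a failed eval restarted on the same commit would skip every task that already finished.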
That's cool! Thanks for the explanation!