Added `diskcache`-based result caching to the base model.
Some models are very expensive to run inference on (e.g., Llama-3.3-70B). Rerunning inference, for example to add a new metric, would be very time-consuming and expensive, especially since at least four 80GB GPUs are needed for inference.
We might want to add a flag to enable/disable caching. It would also be useful for the other methods, such as loglikelihood generation.
Thanks! I don't know when I'll have the capacity to add it to the other methods, though.
This might not be necessary anymore with PR #488.
Want us to close this one?
I personally think it would still be nice to have caching here too, but for me it's no longer strictly necessary, I guess.
It would still be useful for making local inference of large models more robust.