
How is perplexity calculated with the KV cache?

Open tsengalb99 opened this issue 1 year ago • 0 comments

I've noticed that QuaRot and other KV cache papers report perplexity, but it is unclear to me how the quantized KV cache is actually used during the perplexity calculation. Do you have a detailed writeup of how you calculate perplexity with the quantized KV cache? Thanks.

tsengalb99, Sep 24 '24 03:09