
chat / KVCache requires re-prepare of media

Open davidkoski opened this issue 1 year ago • 3 comments

See also #277

Although KVCache is useful to avoid recomputing state, the full input to the VLM has to be rebuilt each time -- this includes preparing the images and video. It would be nice if we could encapsulate that state somehow.

davidkoski, Apr 23 '25 20:04
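One way to picture the "encapsulate that state" idea from the issue is a small store that remembers prepared media per attachment, so a later chat turn only prepares media it has not seen before. This is a sketch under stated assumptions, not the library's API: `PreparedMedia`, `PreparedMediaStore`, and `fakePrepare` are hypothetical stand-ins, and the real preparation step (decode, resize, normalize) is replaced by a trivial closure.

```swift
import Foundation

// Hypothetical stand-in for whatever the VLM processor produces for one
// image or video. Not an MLX type.
struct PreparedMedia: Equatable {
    let values: [Float]
}

// Hypothetical per-conversation state: keeps prepared media next to the
// chat so that each attachment is prepared at most once.
struct PreparedMediaStore {
    private var prepared: [URL: PreparedMedia] = [:]
    private(set) var prepareCount = 0

    // Returns the cached result for `url`, calling `prepare` only on a miss.
    mutating func media(
        for url: URL,
        prepare: (URL) throws -> PreparedMedia
    ) rethrows -> PreparedMedia {
        if let hit = prepared[url] { return hit }
        let result = try prepare(url)
        prepareCount += 1
        prepared[url] = result
        return result
    }
}

// Usage: the same image appears in two consecutive turns.
var mediaStore = PreparedMediaStore()
let image = URL(fileURLWithPath: "/tmp/cat.png")

// Stand-in for the expensive preparation work (assumption for illustration).
let fakePrepare: (URL) -> PreparedMedia = { url in
    PreparedMedia(values: [Float(url.path.count)])
}

let firstTurn = mediaStore.media(for: image, prepare: fakePrepare)
let secondTurn = mediaStore.media(for: image, prepare: fakePrepare)
```

Keying by URL is only one possible choice; a content hash would also cover media that arrives as raw data rather than as a file.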

I'm writing an LLM server and I want to implement a prompt cache, but I'm having trouble with the Sendable boundary around KVCache / TokenIterator. Do you have any suggestions?

I'm thinking that ideally the KVCache would be stored in the ModelContainer actor or in ModelContext?

jolonf, May 02 '25 09:05
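The question above, about keeping the cache inside an actor, can be sketched in plain Swift. The non-Sendable cache never leaves the actor; only Sendable values (a session id, token arrays, a count) cross the boundary. Everything here is an assumption for illustration: `FakeKVCache` and `PromptCacheStore` are hypothetical stand-ins, not the real `KVCache` or `ModelContainer`, and the "cache" is just a list of token ids.

```swift
// Hypothetical stand-in for a non-Sendable KV cache: a class with mutable
// state. Not the MLX type.
final class FakeKVCache {
    var tokens: [Int] = []
}

// Hypothetical actor that owns one cache per session. Callers pass and
// receive only Sendable values, so the cache never crosses isolation.
actor PromptCacheStore {
    private var caches: [String: FakeKVCache] = [:]

    // Records `prompt` for the session and returns how many leading tokens
    // were already covered by the previous prompt (the reusable prefix).
    func extend(session: String, prompt: [Int]) -> Int {
        let cache = caches[session] ?? FakeKVCache()
        caches[session] = cache
        let reused = zip(cache.tokens, prompt)
            .prefix(while: { pair in pair.0 == pair.1 })
            .count
        cache.tokens = prompt
        return reused
    }
}

// Usage: the second request shares a three-token prefix with the first.
let promptStore = PromptCacheStore()
let reusedFirst = await promptStore.extend(session: "s1", prompt: [1, 2, 3])
let reusedSecond = await promptStore.extend(session: "s1", prompt: [1, 2, 3, 4])
```

In a real server the generation step would also have to run inside the same isolation domain as the cache, which is the harder part of the question and is not addressed by this sketch.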

@jolonf please make a new issue for this -- we can discuss.

davidkoski, May 02 '25 15:05

I've created an issue here: https://github.com/ml-explore/mlx-swift-examples/issues/310

jolonf, May 03 '25 03:05