GPT-RAG
Enable Response Caching to reduce costs
As part of the roadmap, there is a request to add response caching to the solution in order to reduce the number of requests sent to OpenAI.
We are currently working on it.
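A semantic response cache along these lines could be sketched as follows. This is a minimal in-memory sketch, not the project's actual implementation: the `embed` callable is assumed (in practice it would be an Azure OpenAI embeddings call), and a production version would persist entries in a vector store such as Cosmos DB rather than a Python list.

```python
import math
from typing import Callable, List, Optional, Tuple


class SemanticCache:
    """In-memory semantic cache: returns a stored response when a new
    query's embedding is close enough to a previously cached query,
    avoiding a repeated completion request to OpenAI."""

    def __init__(self, embed: Callable[[str], List[float]], threshold: float = 0.9):
        self.embed = embed          # assumed embedding function, e.g. Azure OpenAI embeddings
        self.threshold = threshold  # minimum cosine similarity to count as a hit
        self.entries: List[Tuple[List[float], str]] = []

    @staticmethod
    def _cosine(a: List[float], b: List[float]) -> float:
        # Standard cosine similarity; 0.0 if either vector is all zeros.
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    def get(self, query: str) -> Optional[str]:
        # Find the most similar cached query; return its response on a hit.
        q = self.embed(query)
        best = max(self.entries, key=lambda e: self._cosine(q, e[0]), default=None)
        if best and self._cosine(q, best[0]) >= self.threshold:
            return best[1]  # cache hit: skip the OpenAI completion call
        return None  # cache miss: caller proceeds with a real request

    def put(self, query: str, response: str) -> None:
        self.entries.append((self.embed(query), response))
```

The threshold trades cost savings against answer freshness: a higher value only reuses responses for near-identical queries, while a lower value caches more aggressively at the risk of returning a stale or mismatched answer.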
Hello, has this been addressed already?
References: https://www.linkedin.com/feed/update/urn:li:activity:7177084885977718785/
https://stochasticcoder.com/2024/03/22/improve-llm-performance-using-semantic-cache-with-cosmos-db/
Additional research links:
https://github.com/microsoft/kernel-memory