Prompt Compression
Summary
It would be nice to have some prompt compression to reduce the number of tokens used and hopefully get better performance.
Motivation
Cost savings, and potentially better performance, as long as the compression keeps the essentials without losing information or context.
Technical Design
More here: https://llmlingua.com/llmlingua2.html
I can see this as a configurable feature that compresses the user prompt.
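As a rough sketch of the idea: LLMLingua-2 uses a trained token classifier to decide which tokens to keep, but the core "drop low-information tokens" step can be illustrated with a naive stop-word filter (the word list below is just a hypothetical example, not from LLMLingua):

```python
import re

# Hypothetical low-information words to drop; LLMLingua-2 learns this
# decision per token instead of using a fixed list.
FILLER_WORDS = {
    "a", "an", "the", "and", "in", "you", "can", "please",
    "could", "would", "me", "for", "to", "of", "some",
}

def compress_prompt(prompt: str) -> str:
    """Drop filler words, keeping word order and trailing punctuation."""
    tokens = re.findall(r"\w+[?!.]?|\S", prompt)
    kept = [t for t in tokens if t.strip("?!.").lower() not in FILLER_WORDS]
    return " ".join(kept)

print(compress_prompt("can you create a scratch and win game in a webpage?"))
# -> "create scratch win game webpage?"
```

In practice this would call a real compressor (e.g. LLMLingua's library) behind a config flag rather than a hand-written filter.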
Alternatives to Consider
Additional context
An example:
User:
can you create a scratch and win game in a webpage?
After compression:
create scratch win game webpage?
We're doing compression in the MonologueAgent already. This is mostly up to each Agent to implement.
I think we'll want to put a "best practices" doc together for Agent design at some point--will keep this one in mind!
@rbren I saw the summarization Agent, which is kind of different (maybe I am wrong). Happy to help build that!
Similar idea, but I suppose a bit different 😄
Would love to see a new EfficientAgent that minimizes tokens!