Add usage summary for agents
Why are these changes needed?
- `Agent.print_usage_summary()` will print the cost summary for the agent.
- `Agent.get_actual_usage()` and `Agent.get_total_usage()` will return the usage summary in a dict. When an agent doesn't use an LLM, they will return None.
- `Agent.reset()` will reset the usage summary.
- `autogen.agent_utils.gather_usage_summary` will gather the usage summary for a list of agents.
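To illustrate what gathering over a list of agents amounts to, here is a minimal sketch of merging per-agent usage dicts. The dict layout (a `total_cost` key plus one entry per model name) and the helper name `merge_usage` are assumptions for illustration, not the code in this PR.

```python
from typing import List, Optional


def merge_usage(summaries: List[Optional[dict]]) -> Optional[dict]:
    """Sum usage dicts; None entries (agents without an LLM) are skipped.

    Assumed layout: {"total_cost": float, "<model>": {"cost", "prompt_tokens",
    "completion_tokens", "total_tokens"}}.
    """
    merged = None
    for summary in summaries:
        if summary is None:
            continue
        if merged is None:
            merged = {"total_cost": 0.0}
        for key, value in summary.items():
            if key == "total_cost":
                merged["total_cost"] += value
                continue
            # Any other key is treated as a model name with its own counters.
            model = merged.setdefault(
                key,
                {"cost": 0.0, "prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0},
            )
            for field in model:
                model[field] += value.get(field, 0)
    return merged
```

If every agent in the list reports None, the merged result is None as well, mirroring the per-agent behavior described above.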
Related issue number
Closes #1070
Checks
- [ ] I've included any doc changes needed for https://microsoft.github.io/autogen/. See https://microsoft.github.io/autogen/docs/Contribute#documentation to build and test documentation locally.
- [ ] I've added tests (if relevant) corresponding to the changes introduced in this PR.
- [ ] I've made sure all auto checks have passed.
Codecov Report
Attention: 6 lines in your changes are missing coverage. Please review.
Comparison is base (563b1bb) 32.08% compared to head (5c141ca) 66.57%.
| Files | Patch % | Lines |
|---|---|---|
| autogen/agentchat/conversable_agent.py | 60.00% | 3 Missing and 3 partials :warning: |
Additional details and impacted files
@@ Coverage Diff @@
## main #1269 +/- ##
===========================================
+ Coverage 32.08% 66.57% +34.48%
===========================================
Files 32 33 +1
Lines 4394 4431 +37
Branches 1025 1090 +65
===========================================
+ Hits 1410 2950 +1540
+ Misses 2867 1178 -1689
- Partials 117 303 +186
| Flag | Coverage Δ | |
|---|---|---|
| unittests | 66.50% <83.78%> (+34.46%) | :arrow_up: |
I think this PR is fine as it is, though I would consider moving toward a design pattern that minimizes intermediate state. For example, the accumulators for costs and tokens are not needed, since those values can be computed directly from the complete history of OpenAI responses in the conversation. Keeping intermediate state and throwing away the original data can lead to many hard-to-trace bugs, because you no longer have the original data to verify against. Having the original data can also enable many diagnostic and analytic features in the future.
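A rough sketch of that idea, with hypothetical names (`ResponseRecord`, `summarize`) and a simplified record shape: keep every response record and derive the totals on demand instead of maintaining running counters.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ResponseRecord:
    """Hypothetical, simplified view of one LLM response."""
    model: str
    prompt_tokens: int
    completion_tokens: int
    cost: float


def summarize(history: List[ResponseRecord]) -> dict:
    # Totals are recomputed from the raw records, so they can always be
    # checked against the original data.
    return {
        "total_cost": sum(r.cost for r in history),
        "prompt_tokens": sum(r.prompt_tokens for r in history),
        "completion_tokens": sum(r.completion_tokens for r in history),
    }
```

Because the history is never discarded, other views (per-model cost, per-turn token counts) can be added later without changing what is stored.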
I would consider looking at PR #1222 as an example (or a starting point), and consider creating a diagnostic context to enable features like cost estimation, LLM API latency stats, LLM API usage logging, etc.
diagnostic.start()
a.initiate_chat(b)
diagnostic.usage()
diagnostic.end()
diagnostic.summary()
Something similar.
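To make the shape of such a diagnostic object a bit more concrete, here is a minimal sketch. The `Diagnostic` class and its `log_call` hook are hypothetical (nothing like this exists in autogen today); the assumption is that the client wrapper would call `log_call` once per LLM response.

```python
from typing import Dict, List


class Diagnostic:
    """Hypothetical recorder for LLM calls; not part of autogen."""

    def __init__(self) -> None:
        self._records: List[Dict[str, float]] = []
        self._active = False

    def start(self) -> None:
        # Begin a fresh recording window.
        self._records = []
        self._active = True

    def end(self) -> None:
        # Stop recording; records collected so far are kept.
        self._active = False

    def log_call(self, prompt_tokens: int, completion_tokens: int,
                 cost: float, latency_s: float) -> None:
        # Assumed hook: ignored unless a recording window is open.
        if self._active:
            self._records.append({
                "prompt_tokens": prompt_tokens,
                "completion_tokens": completion_tokens,
                "cost": cost,
                "latency_s": latency_s,
            })

    def usage(self) -> Dict[str, float]:
        # Derived from the raw records rather than from running counters.
        return {
            "total_cost": sum(r["cost"] for r in self._records),
            "total_tokens": sum(
                r["prompt_tokens"] + r["completion_tokens"] for r in self._records
            ),
        }

    def summary(self) -> Dict[str, float]:
        calls = len(self._records)
        mean_latency = sum(r["latency_s"] for r in self._records) / calls if calls else 0.0
        return {**self.usage(), "calls": calls, "mean_latency_s": mean_latency}
```

Since the recorder sits at the client level, it would also see calls made outside the usual agent flow, as long as every client reports to it.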
I am fixing the test error in #1274
Wait, how does this work for agents like GroupChatManager or agents that are enabled as Teachable? These agents make additional LLM calls outside the usual flow.
I'm fixing the test failure in #1284
> Wait, how does this work for agents like GroupChatManager or agents that are enabled as Teachable? These agents make additional LLM calls outside the usual flow.
Thanks for the comment!
`GroupChatManager` inherits from `ConversableAgent`, so `print_usage_summary` and `gather_usage_summary` work for it.
But you are right about agents that don't follow the usual flow. If additional flows are defined inside an agent, the user needs to take them out explicitly and pass them to `gather_usage_summary`. This still doesn't work if additional clients are defined. For example, `CompressibleAgent` has a `self.client` and a `self.compress_client`, and this method doesn't account for `self.compress_client`.
Based on these observations, I am considering the following changes for this PR:
- remove `Agent.print_usage_summary()`, `Agent.get_actual_usage()` and `Agent.get_total_usage()`, as they may not reflect real usage by the agent.
- only add a `gather_usage_summary_from_clients` utility function. We explicitly ask users to collect all the defined `OpenAIWrapper` instances from agents.
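A minimal sketch of what such a client-level utility could look like. `StubClient` stands in for `OpenAIWrapper`, and both the `total_usage_summary` attribute and the flat summary layout are assumptions made for illustration.

```python
from typing import Iterable, Optional


class StubClient:
    """Stand-in for an LLM client wrapper (assumed attribute name)."""

    def __init__(self, total_usage_summary: Optional[dict] = None) -> None:
        self.total_usage_summary = total_usage_summary


def gather_usage_summary_from_clients(clients: Iterable[StubClient]) -> Optional[dict]:
    """Sum usage over clients; returns None if no client made an LLM call."""
    seen = set()
    total = None
    for client in clients:
        # A client shared by several agents must only be counted once.
        if id(client) in seen:
            continue
        seen.add(id(client))
        summary = client.total_usage_summary
        if summary is None:
            continue
        if total is None:
            total = {"total_cost": 0.0, "total_tokens": 0}
        total["total_cost"] += summary["total_cost"]
        total["total_tokens"] += summary["total_tokens"]
    return total
```

With this shape, a user of an agent that holds two clients would pass both, e.g. `gather_usage_summary_from_clients([agent.client, agent.compress_client])`.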
As a next step, maybe we can provide an abstract class method in conversable_agent that lets users (and our existing contrib agents) register additional usage. Then we can add back `Agent.print_usage_summary()`, `Agent.get_actual_usage()` and `Agent.get_total_usage()`, along with usage collection for agents.
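One possible shape for that registration hook, sketched with hypothetical names (`BaseAgent`, `usage_clients`, `get_total_cost`) and a stub client: the base class reports its default client, and subclasses override a single method to declare any extra clients.

```python
from typing import List, Optional


class StubClient:
    """Stand-in for an LLM client wrapper that knows its own total cost."""

    def __init__(self, total_cost: float) -> None:
        self.total_cost = total_cost


class BaseAgent:
    """Hypothetical base class; not the real ConversableAgent."""

    def __init__(self, client: Optional[StubClient] = None) -> None:
        self.client = client

    def usage_clients(self) -> List[StubClient]:
        # Subclasses override this to register additional clients.
        return [self.client] if self.client is not None else []

    def get_total_cost(self) -> Optional[float]:
        clients = self.usage_clients()
        if not clients:
            return None  # the agent does not use an LLM
        return sum(c.total_cost for c in clients)


class CompressingAgent(BaseAgent):
    """Hypothetical agent with a second client used for compression."""

    def __init__(self, client: StubClient, compress_client: StubClient) -> None:
        super().__init__(client)
        self.compress_client = compress_client

    def usage_clients(self) -> List[StubClient]:
        return super().usage_clients() + [self.compress_client]
```

The per-agent summary methods could then be restored on top of `usage_clients()` without silently missing calls made through a secondary client.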
@kevin666aa we can also merge this PR and make the proposed change in the next PR (create an issue first).