autogen Add usage summary for agents

Why are these changes needed?

Agent.print_usage_summary() will print the cost summary for the agent.
Agent.get_actual_usage() and Agent.get_total_usage() will return the usage summary as a dict. When an agent doesn't use an LLM, they will return None.
Agent.reset() will reset the usage summary.
autogen.agent_utils.gather_usage_summary will gather the usage summary for a list of agents.
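
A minimal sketch of the gathering behavior described above, assuming a flat summary dict and using a hypothetical `StubAgent` in place of the real agent classes. The key names and the `gather_usage` helper are illustrative assumptions, not the API added in this PR.

```python
from typing import Dict, List, Optional

# Assumed summary shape for illustration only.
UsageSummary = Dict[str, float]


class StubAgent:
    """Hypothetical stand-in for an agent; not the autogen class."""

    def __init__(self, name: str, usage: Optional[UsageSummary] = None) -> None:
        self.name = name
        self._usage = usage

    def get_total_usage(self) -> Optional[UsageSummary]:
        # Returns None when the agent never called an LLM.
        return self._usage


def gather_usage(agents: List[StubAgent]) -> UsageSummary:
    """Sum usage over agents, skipping those that report None."""
    total: UsageSummary = {}
    for agent in agents:
        usage = agent.get_total_usage()
        if usage is None:
            continue
        for key, value in usage.items():
            total[key] = total.get(key, 0) + value
    return total


assistant = StubAgent("assistant", {"cost": 0.5, "prompt_tokens": 100, "completion_tokens": 20})
critic = StubAgent("critic", {"cost": 0.25, "prompt_tokens": 40, "completion_tokens": 10})
user_proxy = StubAgent("user_proxy")  # no LLM, so its usage is None

summary = gather_usage([assistant, critic, user_proxy])
```

Agents that return None are skipped rather than treated as zero usage, so a list mixing LLM and non-LLM agents can be passed in directly.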

Related issue number

Closes #1070

Checks

[ ] I've included any doc changes needed for https://microsoft.github.io/autogen/. See https://microsoft.github.io/autogen/docs/Contribute#documentation to build and test documentation locally.
[ ] I've added tests (if relevant) corresponding to the changes introduced in this PR.
[ ] I've made sure all auto checks have passed.

Jan 15 '24 20:01 yiranwu0

Codecov Report

Attention: 6 lines in your changes are missing coverage. Please review.

Comparison is base (563b1bb) 32.08% compared to head (5c141ca) 66.57%.

Files	Patch %	Lines
autogen/agentchat/conversable_agent.py	60.00%	3 Missing and 3 partials :warning:

Additional details and impacted files

@@             Coverage Diff             @@
##             main    #1269       +/-   ##
===========================================
+ Coverage   32.08%   66.57%   +34.48%     
===========================================
  Files          32       33        +1     
  Lines        4394     4431       +37     
  Branches     1025     1090       +65     
===========================================
+ Hits         1410     2950     +1540     
+ Misses       2867     1178     -1689     
- Partials      117      303      +186

Flag	Coverage Δ
unittests	`66.50% <83.78%> (+34.46%)`	:arrow_up:

Flags with carried forward coverage won't be shown.


Jan 15 '24 20:01 codecov-commenter

I think this PR is fine as it is, though I would consider moving toward a design pattern that minimizes intermediate state, e.g., the accumulators for costs and tokens. Those values can be computed directly if you have the complete history of OpenAI responses for the conversation. Keeping intermediate state and throwing away the original data can lead to many hard-to-trace bugs, because you no longer have the original data to verify against. Having the original data also enables many diagnostic and analytic features in the future.
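
As a rough illustration of computing usage from the retained history instead of a running accumulator, here is a sketch under stated assumptions: `ResponseRecord`, the model name, and the prices in `PRICE_PER_1K` are all hypothetical and do not reflect real OpenAI pricing or the autogen data model.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ResponseRecord:
    """Hypothetical record kept for every LLM response in a conversation."""
    model: str
    prompt_tokens: int
    completion_tokens: int


# Hypothetical (input, output) prices per 1K tokens, for illustration only.
PRICE_PER_1K = {"model-a": (0.01, 0.03)}


def usage_from_history(history: List[ResponseRecord]) -> Dict[str, float]:
    """Derive totals on demand from the full history; nothing is accumulated."""
    cost = 0.0
    for record in history:
        price_in, price_out = PRICE_PER_1K[record.model]
        cost += record.prompt_tokens / 1000 * price_in
        cost += record.completion_tokens / 1000 * price_out
    return {
        "prompt_tokens": sum(r.prompt_tokens for r in history),
        "completion_tokens": sum(r.completion_tokens for r in history),
        "cost": cost,
    }


history = [
    ResponseRecord("model-a", prompt_tokens=1000, completion_tokens=500),
    ResponseRecord("model-a", prompt_tokens=2000, completion_tokens=1000),
]
usage = usage_from_history(history)
```

Because the records are kept, the totals can be recomputed or audited at any time, and a price change only requires updating the table.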

I would consider looking at PR #1222 as an example (or a starting point), and consider creating a diagnostic context to enable features like cost estimation, LLM API latency stats, LLM API usage logging, etc.


diagnostic.start()
 
a.initiate_chat(b)

diagnostic.usage()

diagnostic.end()

diagnostic.summary()

Something similar.
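
One way such a diagnostic context could look, sketched as a plain Python context manager. The `Diagnostic` class, its `call` method, and `fake_llm` are hypothetical names invented for this sketch; nothing here is an existing autogen API.

```python
import time
from typing import Callable, Dict, List


class Diagnostic:
    """Hypothetical diagnostic context; records every call made while active."""

    def __init__(self) -> None:
        self.records: List[Dict] = []
        self.active = False

    def start(self) -> None:
        self.active = True

    def end(self) -> None:
        self.active = False

    def __enter__(self) -> "Diagnostic":
        self.start()
        return self

    def __exit__(self, *exc_info) -> bool:
        self.end()
        return False  # do not swallow exceptions

    def call(self, llm: Callable[[str], Dict], prompt: str) -> Dict:
        """Invoke llm and, if active, keep the raw response and its latency."""
        began = time.perf_counter()
        response = llm(prompt)
        if self.active:
            latency = time.perf_counter() - began
            self.records.append({"response": response, "latency_s": latency})
        return response

    def usage(self) -> int:
        # Computed from the stored responses, not from a running counter.
        return sum(r["response"]["total_tokens"] for r in self.records)

    def summary(self) -> Dict[str, int]:
        return {"calls": len(self.records), "total_tokens": self.usage()}


def fake_llm(prompt: str) -> Dict:
    # Stand-in for a real client; the token count is just the word count.
    return {"text": prompt.upper(), "total_tokens": len(prompt.split())}


diagnostic = Diagnostic()
with diagnostic:
    diagnostic.call(fake_llm, "hello there")
    diagnostic.call(fake_llm, "how are you today")
diagnostic.call(fake_llm, "not recorded")  # outside the context, so ignored

report = diagnostic.summary()
```

Storing the raw responses alongside their latencies keeps the original data available, so cost estimation and latency stats can both be derived later from the same records.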

Jan 15 '24 22:01 ekzhu

I am fixing the test error in #1274

Jan 16 '24 03:01 sonichi

Wait, how does this work for agents like GroupChatManager or agents that are enabled as Teachable? These agents make additional LLM calls outside the usual flow.

Jan 16 '24 16:01 afourney

I'm fixing the test failure in #1284

Jan 16 '24 17:01 sonichi

Wait, how does this work for agents like GroupChatManager or agents that are enabled as Teachable? These agents make additional LLM calls outside the usual flow.

Thanks for the comment! GroupChatManager inherits from ConversableAgent, so print_usage_summary and gather_usage_summary work for it. But you are right about agents that don't follow the usual flow. If additional flows are defined inside an agent, the user needs to pull them out explicitly and pass them to gather_usage_summary. This still doesn't work if additional clients are defined. For example, CompressibleAgent has both a self.client and a self.compress_client, and this method doesn't account for self.compress_client.

Based on these observations, I am considering the following changes for this PR:

Remove Agent.print_usage_summary(), Agent.get_actual_usage() and Agent.get_total_usage(), as they may not reflect real usage by the agent.
Only add a gather_usage_summary_from_clients utility function. We explicitly ask users to collect all the OpenAIWrapper instances defined in their agents.
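
A sketch of how a client-based utility could behave, assuming each client exposes its own cost and token totals. `StubClient` and `StubCompressibleAgent` are hypothetical stand-ins, and the attribute names are assumptions, not the OpenAIWrapper interface.

```python
from typing import Dict, Iterable


class StubClient:
    """Hypothetical stand-in for an OpenAIWrapper-like client."""

    def __init__(self, cost: float, total_tokens: int) -> None:
        self.cost = cost
        self.total_tokens = total_tokens


class StubCompressibleAgent:
    """Hypothetical agent holding two clients, as in the case described above."""

    def __init__(self) -> None:
        self.client = StubClient(cost=0.5, total_tokens=1000)
        self.compress_client = StubClient(cost=0.125, total_tokens=300)


def gather_usage_summary_from_clients(clients: Iterable[StubClient]) -> Dict[str, float]:
    """Sum usage over exactly the clients the caller passes in."""
    summary: Dict[str, float] = {"cost": 0.0, "total_tokens": 0}
    for client in clients:
        summary["cost"] += client.cost
        summary["total_tokens"] += client.total_tokens
    return summary


agent = StubCompressibleAgent()

# Passing only the main client misses the compression calls.
only_main = gather_usage_summary_from_clients([agent.client])

# The caller collects every client explicitly to get the full picture.
both = gather_usage_summary_from_clients([agent.client, agent.compress_client])
```

The trade-off is that completeness becomes the caller's responsibility: any client left out of the list is silently excluded from the summary.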

In the next step, maybe we can provide an abstract class method in conversable_agent for users (and our existing contrib agents) to register additional usage. Then we can add back Agent.print_usage_summary(), Agent.get_actual_usage() and Agent.get_total_usage(), along with usage collection for agents.

Jan 16 '24 17:01 yiranwu0

@kevin666aa we can also merge this PR and make the proposed change in the next PR (create an issue first).

Jan 16 '24 19:01 sonichi