agentscope
Make the dictionary parser much more flexible, and decouple task-specific information from DictDialogAgent
Description
Motivation
- Support changing the required fields in the returned dictionary dynamically. For example, the same agent should respond with a dictionary containing an "agreement" field during a discussion, but with a "vote" field during the voting process.
- Decouple `DictDialogAgent` from special fields in the returned dictionary. Currently, `DictDialogAgent` assumes that the dictionary generated by the LLM must have a `"speak"` field, which is task-specific.
- Allow filtering the generated dictionary when storing it into memory, returning it to other agents, and controlling the application workflow. For example, in a response dictionary, some fields should (or should not) be stored into memory, some fields should (or should not) be returned to other agents, and some fields are used to control the application workflow:
fake_parsed_response = {
    "thought": "xxx",
    "speak": "xxx",
    "agreement": True,  # or False
}
self.speak(fake_parsed_response["speak"])  # only the "speak" field is spoken
self.memory.add(fake_parsed_response)  # all fields are stored in memory
return Msg(
    self.name,
    content=fake_parsed_response["speak"],  # only the "speak" field in content
    role="assistant",
    metadata=fake_parsed_response["agreement"],  # only the "agreement" field
)
Design
- `DictFilterMixin` class: For parsers that return a dictionary, we add a parent class `DictFilterMixin`, which has `keys_to_speak/memory/return` attributes, and `to_memory`, `to_speak` and `to_return` functions to filter the given dictionary.
- In `DictDialogAgent`, a parser takes responsibility for
  - generating the format instruction ("You should respond in the following format ...")
  - parsing the LLM response into a dictionary
  - filtering the parsed dictionary in the `self.speak`, `self.memory.add` and `return` interfaces
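The filtering idea can be illustrated with a minimal sketch. It is not the implementation in this PR: it borrows the `keys_to_memory`, `keys_to_content` and `keys_to_metadata` argument names from the parser examples in this description, and the filtering semantics (`True` keeps everything, `False` keeps nothing, a string returns one value, a list returns a sub-dictionary) are assumptions made for illustration.

```python
from typing import Any, Sequence, Union

# Hypothetical key specification: a single key, a list of keys, or a bool.
KeySpec = Union[str, Sequence[str], bool]


class DictFilterMixin:
    """Illustrative mixin that selects fields from a parsed dictionary."""

    def __init__(
        self,
        keys_to_memory: KeySpec = True,
        keys_to_content: KeySpec = True,
        keys_to_metadata: KeySpec = False,
    ) -> None:
        self.keys_to_memory = keys_to_memory
        self.keys_to_content = keys_to_content
        self.keys_to_metadata = keys_to_metadata

    def to_memory(self, parsed: dict) -> Any:
        return self._filter(parsed, self.keys_to_memory)

    def to_content(self, parsed: dict) -> Any:
        return self._filter(parsed, self.keys_to_content)

    def to_metadata(self, parsed: dict) -> Any:
        return self._filter(parsed, self.keys_to_metadata)

    @staticmethod
    def _filter(parsed: dict, keys: KeySpec) -> Any:
        # Assumed semantics: True keeps everything, False keeps nothing,
        # a string returns that single value, a sequence returns a sub-dict.
        if keys is True:
            return dict(parsed)
        if keys is False:
            return None
        if isinstance(keys, str):
            return parsed[keys]
        return {k: parsed[k] for k in keys if k in parsed}


parsed = {"thought": "I doubt player2", "speak": "I agree", "agreement": True}
filt = DictFilterMixin(
    keys_to_memory=["thought", "speak"],
    keys_to_content="speak",
    keys_to_metadata=["agreement"],
)
print(filt.to_memory(parsed))
print(filt.to_content(parsed))
print(filt.to_metadata(parsed))
```

Keeping the key lists on the parser rather than on the agent means the agent code never has to name a task-specific field such as "speak".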
The DictDialogAgent works as follows:
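class DictDialogAgent(AgentBase):
    def __init__(self):
        # ...
        self.parser = None

    def reply(self):
        # Build the prompt from the memory plus the parser's format instruction
        prompt = self.model.format(
            self.memory.get_memory(),
            self.parser.format_instruction,
        )
        # Let the parser turn the raw LLM response into a dictionary
        res = self.model(prompt, parse_func=self.parser.parse)
        # Store only the fields selected for memory
        self.memory.add(
            Msg(self.name, self.parser.to_memory(res.parsed), "assistant"),
        )
        # Expose only the selected fields to other agents and the workflow
        msg = Msg(
            self.name,
            content=self.parser.to_content(res.parsed),
            role="assistant",
            metadata=self.parser.to_metadata(res.parsed),
        )
        self.speak(msg)
        return msg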
In this way, when an agent needs to return different fields, developers only need to switch its parser, as follows:
agent = DictDialogAgent("assistant", "gpt-4")

# parser for discussion
discussion_parser = MarkdownJsonDictParser(
    content_hint={
        "speak": "xxx",
        "thought": "xxx",
        "end_discussion": "true/false",
    },
    keys_to_memory=["speak", "thought"],
    keys_to_content="speak",
    keys_to_metadata=["end_discussion"],
)

# parser for vote
vote_parser = MarkdownJsonDictParser(
    content_hint={
        "thought": "xxx",
        "vote": "player1 or player2",
    },
    keys_to_memory=["thought", "vote"],
    keys_to_content="vote",
)

# discussion
agent.set_parser(discussion_parser)
x = None  # no input message for the first turn
while True:
    x = agent(x)
    if x.metadata["end_discussion"]:
        break

# vote
agent.set_parser(vote_parser)
while True:
    # vote ...
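To show what the parser's two other duties (building the format instruction and parsing the response) might look like, here is a self-contained sketch. `SimpleMarkdownJsonParser` is a hypothetical stand-in, not the `MarkdownJsonDictParser` added in this PR; the instruction wording beyond the first sentence, the regular expression, and the required-field check are all assumptions.

```python
import json
import re

FENCE = "`" * 3  # built programmatically to avoid nesting literal fences


class SimpleMarkdownJsonParser:
    """Hypothetical stand-in for a Markdown JSON dictionary parser."""

    def __init__(self, content_hint: dict) -> None:
        self.content_hint = content_hint

    @property
    def format_instruction(self) -> str:
        # Show the model the expected fields inside a fenced JSON block.
        hint = json.dumps(self.content_hint, indent=2)
        return (
            "You should respond in the following format:\n"
            f"{FENCE}json\n{hint}\n{FENCE}"
        )

    def parse(self, text: str) -> dict:
        # Take the first fenced JSON block and check the required fields
        # (the required-field check is an assumption of this sketch).
        pattern = re.escape(FENCE) + r"json\s*(.*?)" + re.escape(FENCE)
        match = re.search(pattern, text, re.DOTALL)
        if match is None:
            raise ValueError("No fenced JSON block found in the response.")
        parsed = json.loads(match.group(1))
        missing = [k for k in self.content_hint if k not in parsed]
        if missing:
            raise ValueError(f"Missing required fields: {missing}")
        return parsed


vote_parser = SimpleMarkdownJsonParser(
    {"thought": "what you think", "vote": "player1 or player2"}
)
fake_response = (
    "Here is my answer.\n"
    f'{FENCE}json\n{{"thought": "player2 is suspicious", "vote": "player2"}}\n{FENCE}'
)
result = vote_parser.parse(fake_response)
print(vote_parser.format_instruction)
print(result)
```

Because the required fields come from `content_hint`, swapping the parser changes both what the model is asked to produce and what is validated, without touching the agent.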
Checklist
Please check the following items before code is ready to be reviewed.
- [x] Code has passed all tests
- [x] Docstrings have been added/updated in Google Style
- [x] Documentation has been updated
- [x] Code is ready for review
@qbc2016 please check if the DictDialogAgent can handle the werewolf game.