pipecat icon indicating copy to clipboard operation
pipecat copied to clipboard

Function calling

Open chadbailey59 opened this issue 1 year ago • 1 comments

Let's wait to merge until I build a foundational example too.

chadbailey59 avatar May 23 '24 14:05 chadbailey59

So, function calling is... weird.

Without function calling, a chatbot pipeline is pretty straightforward:

  • The user says something
  • Pipecat appends whatever the user said to a messages list and send that list to the LLM
  • The LLM generates an assistant response
  • Pipecat generates TTS from that assistant response and plays that audio through the transport

With function calling, the flow is different:

  • The user says something
  • Pipecat appends whatever the user said to a messages list (along with some possible "tools" it may choose to use) and sends that list to the LLM
  • The LLM generates an assistant response that may be text, or it may be a "tool call", i.e. the LLM decides to use one of the available "tools"
    • If it's text:
      • Pipecat generates TTS from that assistant response and plays that audio through the transport
    • if it's a tool call:
      • Pipecat needs to append the assistant message with the tool call, including its params, to the message list
      • Pipecat then needs to call the requested function with the provided params (e.g. "check_weather" with params {location: 'san francisco'}
      • Pipecat then needs to append the results from that function call to the messages list in a weird format
      • Pipecat then needs to re-prompt the LLM with the new messages list to generate an answer to the user's question
      • Finally, Pipecat generates TTS from that second assistant response and plays that audio through the transport

Right now, that entire second branch is implemented in the 15-function-calling example as a FunctionCaller class that pushes a context frame back up the pipeline for the re-prompting. We should probably be handling all this inside the framework itself, but that starts to touch on how much context management we should be doing on behalf of the user.

chadbailey59 avatar May 24 '24 18:05 chadbailey59

@aconchillo I think I've addressed the concerns and the function calling code is ready to merge. But I'm still concerned I may have inadvertently undone some of your changes through various merges and rebases.

chadbailey59 avatar May 28 '24 17:05 chadbailey59

OK, I think I've gotten the new function calling approach where it needs to be. Let's get this one merged, and I can remove the function call frame types in a follow-up PR. @aconchillo

chadbailey59 avatar May 30 '24 14:05 chadbailey59