[Issue]: Best way to call llama3.1 with function calling from a cloud provider?
Describe the issue
I'd like to use Llama 3.1 70B or 405B with function calling. Any recommendations for what is supported in AutoGen?
I have tried but could not get any of these to work (a sketch of the Vertex AI config I was using follows the list):

- Google Cloud's Vertex AI solution to call Llama 3.1 (using AutoGen's Vertex AI support, following https://microsoft.github.io/autogen/docs/topics/non-openai-models/cloud-gemini_vertexai, worked for Gemini models but not Llama, because the Gemini client still calls `publishers/google/models/llama3_1` when it should call `publishers/llama/models/llama3_1`; see https://github.com/microsoft/autogen/blob/main/autogen/oai/gemini.py#L202-L207)
- Google Cloud's Vertex AI API solution (cloud-based proxy server)
- Ollama Llama 3.1 70B hosted locally
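
For context, a minimal sketch of the kind of config used for the Vertex AI attempt, following the cloud-gemini_vertexai guide linked above. The project, location, and Llama model strings are placeholders, not values from this report:

```python
import autogen

# Placeholder Vertex AI config per the cloud-gemini_vertexai guide.
# The same config works with a Gemini model string; with a Llama 3.1
# model the Gemini client builds publishers/google/models/... instead
# of publishers/llama/models/... (see the gemini.py lines linked above).
config_list = [
    {
        "model": "llama3_1",             # placeholder; works when e.g. "gemini-1.5-pro"
        "api_type": "google",
        "project_id": "my-gcp-project",  # placeholder
        "location": "us-central1",       # placeholder
    }
]

assistant = autogen.AssistantAgent(
    name="assistant",
    llm_config={"config_list": config_list},
)
```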
Steps to reproduce
No response
Screenshots and logs
No response
Additional Information
No response
Hey @opqpop, I'd suggest Together.AI as it supports Llama 3.1 (all sizes): https://docs.together.ai/docs/chat-models
We have a Together.AI client class: https://microsoft.github.io/autogen/docs/topics/non-openai-models/cloud-togetherai/
I have noticed that Together.AI documents a Llama 3.1-specific function calling approach that the AutoGen client class doesn't yet support: https://docs.together.ai/docs/llama-3-function-calling. The client class does support standard function calling, though, so it may still work until that change is made.
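
As a rough sketch, standard AutoGen function calling through the Together.AI client class looks like this. The model string and the weather tool are illustrative assumptions (check https://docs.together.ai/docs/chat-models for current model names), not something tested in this thread:

```python
import os
from typing import Annotated

import autogen

# Together.AI config for Llama 3.1; model string is an assumption.
config_list = [
    {
        "api_type": "together",
        "model": "meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo",
        "api_key": os.environ["TOGETHER_API_KEY"],
    }
]

assistant = autogen.AssistantAgent(
    name="assistant",
    system_message="Use the weather tool when asked about weather. Reply TERMINATE when done.",
    llm_config={"config_list": config_list},
)

user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    code_execution_config=False,
    is_termination_msg=lambda msg: "TERMINATE" in (msg.get("content") or ""),
)

# Illustrative tool: registered for the LLM to suggest (assistant) and
# for local execution (user_proxy).
@user_proxy.register_for_execution()
@assistant.register_for_llm(description="Get the current weather for a city.")
def get_weather(city: Annotated[str, "City name"]) -> str:
    return f"The weather in {city} is sunny, 22°C."  # stubbed result

user_proxy.initiate_chat(assistant, message="What's the weather in Paris?")
```

The two decorators are the standard AutoGen pattern: `register_for_llm` exposes the tool schema to the model, and `register_for_execution` lets the user proxy run the suggested call.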