[Issue]: Best way to call llama3.1 with function calling from a cloud provider?
Describe the issue
I'd like to use Llama 3.1 70B or 405B with function calling. Any recommendations for what is supported in AutoGen?
I have tried but could not get any of these to work (a sketch of the Vertex AI config I was using follows the list):

- Google Cloud's Vertex AI solution to call Llama 3.1 (using AutoGen's Vertex AI support, following https://microsoft.github.io/autogen/docs/topics/non-openai-models/cloud-gemini_vertexai, worked for Gemini models but not Llama, because the Gemini client still calls `publishers/google/models/llama3_1` when it should call `publishers/llama/models/llama3_1`; see https://github.com/microsoft/autogen/blob/main/autogen/oai/gemini.py#L202-L207)
- Google Cloud's Vertex AI API solution (cloud-based proxy server)
- Ollama Llama 3.1 70B hosted locally
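
For context, a minimal sketch of the kind of config used for the Vertex AI attempt, following the cloud-gemini_vertexai guide linked above. The project, location, and Llama model strings are placeholders, not values from this report:

```python
import autogen

# Placeholder Vertex AI config per the cloud-gemini_vertexai guide.
# The same config works with a Gemini model string; with a Llama 3.1
# model the Gemini client builds publishers/google/models/... instead
# of publishers/llama/models/... (see the gemini.py lines linked above).
config_list = [
    {
        "model": "llama3_1",             # placeholder; works when e.g. "gemini-1.5-pro"
        "api_type": "google",
        "project_id": "my-gcp-project",  # placeholder
        "location": "us-central1",       # placeholder
    }
]

assistant = autogen.AssistantAgent(
    name="assistant",
    llm_config={"config_list": config_list},
)
```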
Steps to reproduce
No response
Screenshots and logs
No response
Additional Information
No response
Hey @opqpop, I'd suggest Together.AI as it supports Llama 3.1 (all sizes): https://docs.together.ai/docs/chat-models
We have a Together.AI client class: https://microsoft.github.io/autogen/docs/topics/non-openai-models/cloud-togetherai/
I have noticed that Together.AI documents a Llama 3.1-specific function calling approach that the AutoGen client class doesn't yet support: https://docs.together.ai/docs/llama-3-function-calling. The client class does support standard function calling, though, so it may still work until that change is made.
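
As a rough sketch, standard AutoGen function calling through the Together.AI client class looks like this. The model string and the weather tool are illustrative assumptions (check https://docs.together.ai/docs/chat-models for current model names), not something tested in this thread:

```python
import os
from typing import Annotated

import autogen

# Together.AI config for Llama 3.1; model string is an assumption.
config_list = [
    {
        "api_type": "together",
        "model": "meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo",
        "api_key": os.environ["TOGETHER_API_KEY"],
    }
]

assistant = autogen.AssistantAgent(
    name="assistant",
    system_message="Use the weather tool when asked about weather. Reply TERMINATE when done.",
    llm_config={"config_list": config_list},
)

user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    code_execution_config=False,
    is_termination_msg=lambda msg: "TERMINATE" in (msg.get("content") or ""),
)

# Illustrative tool: registered for the LLM to suggest (assistant) and
# for local execution (user_proxy).
@user_proxy.register_for_execution()
@assistant.register_for_llm(description="Get the current weather for a city.")
def get_weather(city: Annotated[str, "City name"]) -> str:
    return f"The weather in {city} is sunny, 22°C."  # stubbed result

user_proxy.initiate_chat(assistant, message="What's the weather in Paris?")
```

The two decorators are the standard AutoGen pattern: `register_for_llm` exposes the tool schema to the model, and `register_for_execution` lets the user proxy run the suggested call.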