Add support for Azure, PaLM, Anthropic, Cohere, and Hugging Face Llama 2 70B models using litellm
This PR adds support for models from all of the providers mentioned above, using https://github.com/BerriAI/litellm/
All of the supported LLM APIs share the same input/output interface.
Here's a sample of how it's used:
```python
import os

from litellm import completion

## set ENV variables
# ENV variables can be set in a .env file, too. Example in .env.example
os.environ["OPENAI_API_KEY"] = "openai key"
os.environ["COHERE_API_KEY"] = "cohere key"
os.environ["HF_API_TOKEN"] = "hf-key"

messages = [{"content": "Hello, how are you?", "role": "user"}]

# OpenAI call
response = completion(model="gpt-3.5-turbo", messages=messages)

# Hugging Face Llama 2 call
response = completion(model="meta-llama/llama-2-7b-hf", messages=messages)

# Hugging Face Llama 2 Guanaco call
response = completion(model="TheBloke/llama-2-70b-Guanaco-QLoRA-fp16", messages=messages)

# Cohere call
response = completion(model="command-nightly", messages=messages)

# Anthropic call
response = completion(model="claude-instant-1", messages=messages)
```
PR checklist:
- [x] Tested by creator on localhost:8000/docs
- [ ] Tested by creator on refinery
- [ ] Tested by reviewer on localhost:8000/docs
- [ ] Tested by reviewer on refinery
- [ ] (If added) common code tested in notebook/script
- [ ] Conforms with the agreed upon code pattern
@LeonardPuettmann @SvenjaKern can you please take a look at this PR?
Will add to the .md files + tests if this initial commit looks good.
We're rolling out support for the top chat LLMs on Hugging Face - are there any you'd like me to add support/examples for here?
@ishaan-jaff Super awesome! Thank you for your contribution. 👍 Really like this! A few questions: How would I configure things like the temperature for the GPT model? Can I also do that with the litellm package? And are the Llama models hosted on HuggingFace?
Code looks great, but I would suggest that this deserves its own brick module. Something like a "general_llm_brick" or "lite_llm_brick"?
@ishaan-jaff Just following up to see if you are still interested in implementing this. :)
Hi @LeonardPuettmannKern, yes!
> How would I configure things like the temperature for the GPT model?

It's one of the input params, exactly like the OpenAI Chat Completion API.
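For example (a minimal sketch; OpenAI-style parameters like `temperature` are passed straight through to `completion()` as keyword arguments):

```python
from litellm import completion

messages = [{"content": "Hello, how are you?", "role": "user"}]

# OpenAI-style sampling params (temperature, max_tokens, ...)
# are accepted as keyword arguments by completion()
response = completion(
    model="gpt-3.5-turbo",
    messages=messages,
    temperature=0.2,
    max_tokens=256,
)
```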
> And are the Llama models hosted on HuggingFace?

You can use Llama from any of the providers we support: SageMaker, Together AI, Replicate, Deep Infra, etc.
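For illustration, litellm routes to a provider via a prefix on the model string; the model identifiers below are placeholders, so check the litellm docs for the exact names:

```python
from litellm import completion

messages = [{"content": "Hello, how are you?", "role": "user"}]

# The provider is selected by the prefix on the model string.
# The model names here are illustrative placeholders.
response = completion(model="replicate/llama-2-70b-chat", messages=messages)
response = completion(model="together_ai/togethercomputer/llama-2-70b-chat", messages=messages)
```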
How do I create a brick?