Add support for OpenAI API
It would be great if we could configure FreeChat to use an OpenAI-compatible API.
Ollama recently added support for it (https://ollama.com/blog/openai-compatibility), and EdgeAI supports the OpenAI API as well (https://edgen.co).
Thanks for the request and for checking out FreeChat. Can you elaborate a bit on your use case? Do you want to use FreeChat as an interface for Ollama or ChatGPT?
It wouldn't matter whether Ollama or ChatGPT is supported, since both use the OpenAI API internally. Supporting one will support both. https://platform.openai.com/docs/api-reference/chat/create
Currently, using "Add or Remove Models" and clicking "+" opens a file picker to select a model file. I'm hoping there is a way to "Add OpenAI Models" and configure the appropriate settings:
- Configure the base URL, which can default to `https://api.openai.com` for OpenAI; if I want to use Ollama I can set the base URL to `http://localhost:11434`.
- Configure `OPENAI_API_KEY`, which will be sent as a header. Ollama doesn't require a token, but it can be set to an arbitrary value since some OpenAI SDKs validate that one is present.
- Configure the model, such as `gpt-3.5-turbo` (a rough sketch of these settings follows below).
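To make this concrete, here is a minimal sketch of how those three settings might be modeled on the Swift side; the type and property names are hypothetical, not FreeChat's actual code:

```swift
import Foundation

// Hypothetical sketch only: names are illustrative, not FreeChat's actual API.
struct RemoteModelConfig: Codable {
    // Defaults to OpenAI; point at http://localhost:11434 for Ollama.
    var baseURL: URL = URL(string: "https://api.openai.com")!
    // Ollama ignores the key, but any placeholder keeps strict SDKs happy.
    var apiKey: String = "ollama"
    var model: String = "gpt-3.5-turbo"
}
```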
Here is an example of calling Ollama using the OpenAI-compatible API:
```sh
export OPENAI_API_URL=http://localhost:11434/v1
export OPENAI_API_KEY=ollama

curl $OPENAI_API_URL/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "llama2",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Who won the world series in 2020?"},
      {"role": "assistant", "content": "The LA Dodgers won in 2020."},
      {"role": "user", "content": "Where was it played?"}
    ]
  }'
```
Response:
```json
{
  "id": "chatcmpl-66",
  "object": "chat.completion",
  "created": 1707970399,
  "model": "llama2",
  "system_fingerprint": "fp_ollama",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The 2020 World Series was played at various locations, including the home stadiums of the competing teams. The Los Angeles Dodgers played their home games at Dodger Stadium in Los Angeles, California, and the Tampa Bay Rays played their home games at Tropicana Field in St. Petersburg, Florida."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 0,
    "completion_tokens": 69,
    "total_tokens": 69
  }
}
```
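In case it's useful, a response of that shape decodes straightforwardly with Codable. A sketch, assuming camelCase mapping via `.convertFromSnakeCase` (the type names are mine, not from FreeChat):

```swift
import Foundation

// Sketch: Codable types mirroring the response JSON above. Decode with a
// JSONDecoder whose keyDecodingStrategy is .convertFromSnakeCase so that
// finish_reason / prompt_tokens map onto the camelCase properties.
struct ChatResponse: Codable {
    struct Message: Codable {
        let role: String
        let content: String
    }
    struct Choice: Codable {
        let index: Int
        let message: Message
        let finishReason: String
    }
    struct Usage: Codable {
        let promptTokens: Int
        let completionTokens: Int
        let totalTokens: Int
    }
    let id: String
    let model: String
    let choices: [Choice]
    let usage: Usage
}

// Usage:
// let decoder = JSONDecoder()
// decoder.keyDecodingStrategy = .convertFromSnakeCase
// let response = try decoder.decode(ChatResponse.self, from: data)
```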
If you want to call the OpenAI API instead:
```sh
export OPENAI_API_URL=https://api.openai.com/v1
export OPENAI_API_KEY=....

# and change "model" to "gpt-3.5-turbo" in the request body
```
As for the use case: not everyone in my family has a machine powerful enough to run the models locally. I would like to run Ollama (or another OpenAI-compatible server) on one powerful machine so that any machine on my home network can reuse it.
It seems like FreeChat is already running llama.cpp/examples/server but is using the /completion API. I suggest migrating to the /v1/chat/completions endpoint mentioned in their README instead, which llama.cpp already supports; that way local models and Ollama would all use the OpenAI API.
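For illustration, here is roughly what that could look like from Swift, assuming the configurable base URL and API key proposed above (this reuses the hypothetical RemoteModelConfig sketch from earlier; it is not FreeChat's actual code):

```swift
import Foundation

// Sketch: POST to an OpenAI-compatible /v1/chat/completions endpoint.
struct ChatMessage: Codable {
    let role: String
    let content: String
}

struct ChatRequest: Codable {
    let model: String
    let messages: [ChatMessage]
}

func sendChat(config: RemoteModelConfig, messages: [ChatMessage]) async throws -> Data {
    var request = URLRequest(url: config.baseURL.appendingPathComponent("v1/chat/completions"))
    request.httpMethod = "POST"
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    request.setValue("Bearer \(config.apiKey)", forHTTPHeaderField: "Authorization")
    request.httpBody = try JSONEncoder().encode(ChatRequest(model: config.model, messages: messages))
    let (data, _) = try await URLSession.shared.data(for: request)
    return data
}
```

The same function would then work against llama.cpp's server, Ollama, or api.openai.com just by swapping the base URL.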
OK got it, thanks for the detailed writeup!
I support adding that functionality, and I think it would make sense to extend @shavit's work on the "remote model" option here: https://github.com/psugihara/FreeChat/pull/50
There is one blocker I see currently. I agree that we only want to program against one API (right now that's /completion), but AFAIK the /v1/chat/completions endpoint only supports 2 prompt templates, compared to the 5 we currently support.
There is some discussion of how to support different templates here but they have not reached consensus.
Looks like templating is coming along. Let's update this issue as support becomes available from llama.cpp: https://github.com/ggerganov/llama.cpp/pull/5538
I can work on it after they merge that.