
Feature Request: 🛠 Support for local LLM tools like Ollama in Terminal Chat

Open Samk13 opened this issue 2 years ago • 2 comments

Description of the new feature/enhancement

The Windows Terminal Chat currently only supports Azure OpenAI Service. This restriction limits developers who work with or are developing their own local Large Language Models (LLMs), or who use tools such as Ollama and need to interface with them directly within the Terminal. The ability to connect to a local LLM service would allow for better flexibility, especially for those concerned with privacy, working offline, or dealing with sensitive information that cannot be sent to cloud services.

Proposed technical implementation details (optional)

Include functionality to support local LLM services by allowing users to configure a connection to local AI models. This would involve:

  1. Providing an option in the Terminal Chat settings to specify the endpoint of a local LLM service.
  2. Allowing the user to set the port that the local LLM service should listen on for incoming requests (see the sketch below).
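
For illustration only, here is a minimal sketch of what connecting to such a locally configured endpoint amounts to, assuming Ollama's OpenAI-compatible API on its default port 11434; the host, port, and model name are placeholders, not a proposed settings schema:

```python
import json
import urllib.request

# Assumed local service: Ollama exposes an OpenAI-compatible API on port 11434 by default.
base_url = "http://localhost:11434"  # user-configurable endpoint + port
model = "llama3"                     # placeholder model name

payload = {
    "model": model,
    "messages": [{"role": "user", "content": "How do I list hidden files in PowerShell?"}],
}

req = urllib.request.Request(
    f"{base_url}/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)

# POST the chat completion request and print the model's reply.
with urllib.request.urlopen(req) as resp:
    body = json.load(resp)
    print(body["choices"][0]["message"]["content"])
```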

Thanks!

Samk13 avatar Dec 14 '23 00:12 Samk13

Would love to see this feature. Phi models would be great for this.

dossjjx avatar May 28 '24 05:05 dossjjx

As a workaround, I set up https://github.com/g0t4/term-chat-ollama as an intermediate "proxy" that can forward requests to any OpenAI-compatible completions backend, e.g. Ollama, OpenAI, groq.com, etc.

FYI, video overview here: https://youtu.be/-QcSRmrsND0

@dossjjx with this, you can use phi3 by setting the endpoint to https://fake.openai.azure.com:5000/answer?model=phi3

g0t4 avatar Jun 17 '24 06:06 g0t4

Please just implement this as an OpenAI-compatible interface.

They simply require these 3 pieces of information:

OPENAI_BASE_URL="http://some-endpoint/api" OPENAI_KEY="12345" OPENAI_MODEL="gpt-4o"
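
For illustration, a minimal sketch of how those three pieces of information are typically consumed, assuming the openai Python package (v1+); the environment variable names and values are just the placeholders from the comment above:

```python
import os
from openai import OpenAI  # assumes the openai Python package (v1+) is installed

# The three pieces of configuration listed above, read from the environment.
client = OpenAI(
    base_url=os.environ["OPENAI_BASE_URL"],           # e.g. http://localhost:11434/v1 for Ollama
    api_key=os.environ.get("OPENAI_KEY", "not-needed"),  # many local backends ignore the key
)

response = client.chat.completions.create(
    model=os.environ["OPENAI_MODEL"],  # e.g. "gpt-4o" or a local model tag
    messages=[{"role": "user", "content": "Explain the dir command."}],
)
print(response.choices[0].message.content)
```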

schaveyt avatar Nov 01 '24 21:11 schaveyt

Just wanted to add my $.02.

I decided to try a POC and manually updated the hard-coded OpenAI URL and model in src/cascadia/QueryExtension/OpenAILLMProvider.cpp to point at my local Ollama OpenAI endpoint and a model I already had, then did a build. I was able to use the chat feature perfectly with Ollama as the back end. To flesh out the request in a bit more detail...

  1. Make the openAIEndpoint user-configurable, at least as a base URL (e.g. https://ollama.mydomain.com or http://localhost:11434). It's up to you whether to hard-code the /v1 part of the full URI or have the user include it in the input.
  • Personally, I'd lean towards hard-coding that bit, as most 3rd-party LLM apps with an OpenAI-compliant API also include the /v1 in their custom APIs.

  2. Make the OpenAI API token an optional field. It's not required for Ollama and many others; backends without API authorization configured tend to just ignore the authorization header entirely.

  3. Instead of hard-coding a specific model, use the OpenAI List Models API endpoint to populate what models are available, then let the user select the model from a drop-down menu. The endpoint is a GET request to OPENAI_BASE_URL/v1/models (e.g. https://api.openai.com/v1/models); see the sketch after this list.

  • In addition to being a nice element to have customizable, the actual OpenAI API endpoint for this request requires a valid authentication header, so it's an easy way to validate the API key at the same time. Ref: https://platform.openai.com/docs/api-reference/models/list
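
A minimal sketch of that model-listing call, assuming an OpenAI-compatible backend; the base URL is a placeholder, and the Authorization header is simply omitted when no key is configured:

```python
import json
import urllib.request

def list_models(base_url: str, api_key: str | None = None) -> list[str]:
    """Fetch the model IDs an OpenAI-compatible backend advertises at /v1/models."""
    headers = {}
    if api_key:  # optional: Ollama and many local backends don't require a key
        headers["Authorization"] = f"Bearer {api_key}"
    req = urllib.request.Request(f"{base_url}/v1/models", headers=headers)
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    # The OpenAI models list response wraps the entries in a "data" array.
    return [m["id"] for m in data["data"]]

# Example: populate a model drop-down from a local Ollama instance (placeholder URL).
print(list_models("http://localhost:11434"))
```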

@zadjii-msft - Hope this helps!

PS: Implementation as-described for part 3 above would also resolve https://github.com/microsoft/terminal/issues/18200

SamAcctX avatar Mar 05 '25 22:03 SamAcctX