vividfog
Nice, this is an excellent feature done well. Thank you to all contributors.
I'm surprised [LiteLLM](https://github.com/BerriAI/litellm) hasn't been mentioned in the thread yet. I found it today in the Ollama repo's [README.md](https://github.com/jmorganca/ollama#community-integrations). "Call LLM APIs using the OpenAI format", 100+ of them, including...
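For context, here is a minimal sketch of what "the OpenAI format" means in practice with LiteLLM, assuming a local Ollama server with a llama2 model pulled; the model name and endpoint are illustrative, not taken from this thread:

```python
from litellm import completion

# Sketch only: "ollama/llama2" and the local endpoint are assumptions.
# Any provider LiteLLM supports is called the same way.
response = completion(
    model="ollama/llama2",
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
    api_base="http://localhost:11434",
)

# The response comes back in the OpenAI chat-completion shape.
print(response.choices[0].message.content)
```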
I learned today that my initial advice was incomplete. Continue.dev sends two parallel queries: one for the user task and another to summarize the conversation. And LiteLLM logs may show...
This variable, and many others, are per-model settings, not per-server settings. They must be per model because every model needs a different setup. When the server starts, it...
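If it helps, here is a rough sketch of how settings end up attached to a model rather than to the server, assuming this refers to Ollama Modelfile parameters; num_ctx and temperature are only examples, the actual variable may differ:

```python
import requests

# Hypothetical example: bake per-model settings into a Modelfile and register a
# named model that carries them. The parameter values are illustrative.
modelfile = """\
FROM llama2
PARAMETER num_ctx 4096
PARAMETER temperature 0.7
"""

# Ollama's /api/create attaches these settings to the new model itself;
# nothing here is a server-wide flag.
requests.post(
    "http://localhost:11434/api/create",
    json={"name": "llama2-4k", "modelfile": modelfile},
    timeout=60,
)
```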
@logancyang I see. Sorry about the pun, I couldn't resist when it came to mind. I agree that failing silently when the input goes past some threshold isn't optimal. I'll have...
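To make the failure mode concrete, here is a hypothetical sketch of surfacing the problem instead of dropping input quietly; the 4096-token threshold and the rough word-to-token ratio are assumptions, not values from this plugin:

```python
MAX_CONTEXT_TOKENS = 4096  # assumed limit; the real threshold depends on the model

def check_prompt_length(prompt: str) -> str:
    """Refuse loudly instead of letting an oversized prompt be truncated silently."""
    approx_tokens = int(len(prompt.split()) * 1.3)  # crude estimate, not a real tokenizer
    if approx_tokens > MAX_CONTEXT_TOKENS:
        raise ValueError(
            f"Prompt is roughly {approx_tokens} tokens, past the "
            f"{MAX_CONTEXT_TOKENS}-token limit; refusing rather than failing silently."
        )
    return prompt
```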
The UX was fine. Some notes: 1. It asks for an OpenAI key even though I came in planning to use a local model for this. Perhaps "press enter to not...
I asked _?? write a simple template for a docker build file_ and found out that line-by-line is indeed a different use case. As for an explanation, perhaps this would help...
Take a look at how continue.dev comes with its own little embeddings server. It's a new experimental feature, and it's a big piece of software. https://continue.dev/docs/walkthroughs/codebase-embeddings Anyway, a minimal version...
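For comparison, a minimal sketch of the same idea without a dedicated server, assuming Ollama's /api/embeddings endpoint and an embedding-capable model; the model name is illustrative:

```python
import requests

def embed(text: str, model: str = "nomic-embed-text") -> list[float]:
    """Ask a local Ollama instance for one embedding vector."""
    resp = requests.post(
        "http://localhost:11434/api/embeddings",
        json={"model": model, "prompt": text},
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["embedding"]

# Usage: embed code snippets and compare them with cosine similarity, for example.
vector = embed("def build_index(paths): ...")
print(len(vector))
```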
I don't have a strong candidate to offer. My intuition says to stay away from abstract verbs like "embed"; they are better suited for llm, which is a lower-level...
I see it's done with the .live() option and streaming. Very nice. Previously it was possible to pipe the output to Rich to get Markdown rendering, but then streaming is...
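For anyone curious what streaming plus Markdown rendering looks like together, a small sketch assuming Rich's Live display is what .live() maps to; the token generator stands in for the real model stream:

```python
from rich.live import Live
from rich.markdown import Markdown

def fake_stream():
    # Stand-in for a real token stream from the model.
    for chunk in ["# Title\n", "Some ", "*streamed* ", "Markdown."]:
        yield chunk

buffer = ""
# Re-render the accumulated Markdown on every chunk, so formatting and
# streaming coexist instead of one excluding the other.
with Live(Markdown(buffer), refresh_per_second=8) as live:
    for chunk in fake_stream():
        buffer += chunk
        live.update(Markdown(buffer))
```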