Chris
@iSevenDays I've added some comments; I hit breaking issues so I couldn't test further, and we need to resolve those before we can continue. I might have a recommendation:...
Gotcha. Alright, if it's necessary then there's no helping it. I mostly wanted to save you the headache of dealing with that code, since what the frontends expect is not...
Awesome! I'll dig through it this week. I have a rough idea of the direction you were going in with the previous changes in the PR, and if it will eventually...
@iSevenDays I promise I haven't forgotten you and I'm sorry for the long wait. It's been a bit of chaos on my side. Been working 60-80 hour weeks, been trying...
@iSevenDays Added some comments! Not many things need fixing, as far as I can see. I'm in the MCP code now, which will go a bit quicker because, other than looking...
@iSevenDays By the way, I already found a couple of items for the next PRs: thinking tags support, UTF-8 support (emojis etc.), and fixes for coding support (currently, model outputs...
Ok, they should be showing up now; I also resolved the old comments. I'm used to Azure DevOps, which shows comments instantly without requiring a full review to be completed.
> Yes, thinking tags support was done just yesterday; I had to think a lot about how to integrate it with the current limitations.

Ok, then in that case...
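For context, here's a minimal sketch of what thinking-tag extraction can look like. It assumes the model wraps its reasoning in `<think>...</think>` delimiters; the tag name and the split function are illustrative, not this PR's actual implementation:

```python
import re

# Assumed delimiter format; some reasoning models emit <think>...</think>,
# but the tags this PR handles may differ.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def split_thinking(text: str) -> tuple[str, str]:
    """Return (reasoning, visible_answer) from a raw model completion."""
    reasoning = "\n".join(m.strip() for m in THINK_RE.findall(text))
    answer = THINK_RE.sub("", text).strip()
    return reasoning, answer

raw = "<think>The user wants a greeting.</think>Hello!"
reasoning, answer = split_thinking(raw)
print(reasoning)  # The user wants a greeting.
print(answer)     # Hello!
```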
Adding that making it a Chat Completions endpoint will not only make it nearly universal across LLM applications, as the vast majority of local and proprietary hosting...
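To illustrate the portability argument, here's a minimal sketch of a Chat Completions request, assuming a llama.cpp server running locally on port 8080 (the port and model name are illustrative). The same request shape works against most OpenAI-compatible hosts:

```python
import requests

resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "model": "local-model",  # illustrative; the server serves whatever model it loaded
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Say hello."},
        ],
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```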
I don't believe that endpoint is available. If you go to [server.cpp line 4834](https://github.com/ggml-org/llama.cpp/blob/c81f4192f91a1e209c1eec7a84fe5371ef9175da/tools/server/server.cpp#L4834), you can see the registered endpoints; `/v1/responses` is not one of them.
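A quick way to confirm this against a running server (sketch assumes a local llama.cpp server on port 8080): an unregistered route should come back as a 404 from the HTTP layer, while a registered one reaches a handler and returns some other status.

```python
import requests

BASE = "http://localhost:8080"
for path in ("/v1/chat/completions", "/v1/responses"):
    r = requests.post(f"{BASE}{path}", json={"messages": []})
    print(path, "->", r.status_code)
# At the linked commit I'd expect /v1/responses to print 404 (no handler),
# while /v1/chat/completions returns whatever its handler decides.
```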