
Support New Fill In The Middle API for Ollama

rwebb-fundrise opened this issue · 18 comments

The Ollama project recently merged this PR, which adds support for fill-in-the-middle completions via the existing generate endpoint. I would love to see this supported in Custom Suggestion Service as well.

rwebb-fundrise avatar Aug 29 '24 19:08 rwebb-fundrise
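For context, the new API takes the text after the cursor as a `suffix` field on the existing `/api/generate` endpoint, and the model fills in the gap. A minimal sketch of what such a request body could look like (the model tag and sample code here are illustrative, not from the PR):

```python
import json

# Sketch of a fill-in-the-middle request to Ollama's /api/generate
# endpoint: "prompt" is the code before the cursor, "suffix" the code
# after it, and the model generates the text in between.
payload = {
    "model": "codellama:7b-code",  # a FIM-capable model tag (assumption)
    "prompt": "def fib(n):\n    if n <= 1:\n        return n\n",
    "suffix": "    return fib(n - 2) + fib(n - 1)\n",
    "stream": False,
}

body = json.dumps(payload)
print(body)
# A client would POST `body` to http://localhost:11434/api/generate.
```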

Do you know which models support the new API? I have tried several models, but all of them complain that the model does not support insert.

intitni avatar Sep 13 '24 08:09 intitni

This would be really useful. I know that the Continue plugin for VS Code works perfectly with Ollama/codellama/starcoder2. Maybe that would be a starting point?

jmitek avatar Sep 17 '24 12:09 jmitek

@jmitek This model does support the new API, but the results are super weird and worse than using the fill-in-the-middle strategy and filling in the template manually. Are you using the 3b or the 7b model?

(Note: you can already use FIM-supported models through the completion API as long as you know the template.)

intitni avatar Sep 17 '24 12:09 intitni

@intitni I'm mostly using codellama:13b. Also starcoder2:15b. I've no idea what the template might look like for them, though.

jmitek avatar Sep 17 '24 13:09 jmitek

@jmitek The default one is for codellama. The starcoder one looks like

```
<fim-prefix>def fib(n):<fim-suffix>    else:\n        return fib(n - 2) + fib(n - 1)<fim-middle>
```

intitni avatar Sep 17 '24 13:09 intitni
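The manual approach mentioned above amounts to substituting the code before and after the cursor into the model's template string. A sketch (the two template strings are the ones quoted in this thread; your model build may use different tags):

```python
# Filling a FIM prompt template manually. Tag sets vary per model
# family; these two are taken from the thread and may not match
# every model build.
CODELLAMA_TEMPLATE = "<PRE> {prefix} <SUF>{suffix} <MID>"
STARCODER_TEMPLATE = "<fim-prefix>{prefix}<fim-suffix>{suffix}<fim-middle>"

def fill_fim_template(template: str, prefix: str, suffix: str) -> str:
    """Substitute the code before/after the cursor into the template."""
    return template.format(prefix=prefix, suffix=suffix)

prompt = fill_fim_template(
    CODELLAMA_TEMPLATE,
    prefix="def fib(n):",
    suffix="    else:\n        return fib(n - 2) + fib(n - 1)",
)
print(prompt)
```

The filled-in string is then sent as a plain completion prompt; the model stops when it emits the end-of-insertion token.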

Interesting. So I connected it to codellama; here are my settings: image

This is the kind of completion I get in Xcode: image

Using "Default" gives similar results. I notice that it is using the /chat API, though I would have expected the /generate API instead?

But if I choose the "Continue" one, it looks a bit more reasonable, though it seems to duplicate the preceding lines of code. I haven't tried starcoder2 yet.

jmitek avatar Sep 17 '24 13:09 jmitek

@jmitek Please change the model to the completions API and set it up again. Imported models are treated as chat completions API models. You may also need the -code variant of the model, too.

Screenshot 2024-09-17 at 21 29 54

intitni avatar Sep 17 '24 13:09 intitni

@intitni Thanks, so I made it use /generate again and filled in the template exactly as you have it (the default one I had was different). I can see it is using /generate now, and the suggestions seem okay, except they still have the duplicated code. See this example: Before: image

After: image

jmitek avatar Sep 17 '24 13:09 jmitek

@jmitek It works fine on my Mac, though. What are your settings again?

Screenshot 2024-09-17 at 21 43 05

intitni avatar Sep 17 '24 13:09 intitni

> filled in the template exactly as you have it (the default one I had was different)

Oh, the default one is actually correct; I was testing starcoder2 when I took the screenshot.

intitni avatar Sep 17 '24 13:09 intitni

Here are my updated settings: image

I see that you are using codellama:7b-code; I'm using the non "-code" version, if that makes any difference?

jmitek avatar Sep 17 '24 13:09 jmitek

@jmitek You need to reset the template to the default one.

intitni avatar Sep 17 '24 13:09 intitni

I don't know much about the -code suffix; I found it in the documentation (https://ollama.com/library/codellama), in the Fill-in-the-middle section.

intitni avatar Sep 17 '24 13:09 intitni

Awesome! I pulled codellama:13b-code and used this template from the Ollama site (https://ollama.com/library/codellama:13b-code): `<PRE> {prefix} <SUF>{suffix} <MID>`, which is the same as the default one in the app. So it seems to break with the non "-code" version, although the Continue plugin is somehow able to use the non "-code" version... Anyway, it works perfectly :)

jmitek avatar Sep 17 '24 14:09 jmitek

Maybe they are using their own prompt strategy. Can you see what prompt they are sending to Ollama?

intitni avatar Sep 17 '24 14:09 intitni

I have tried the starcoder2 model with Ollama. The completion result is different from Continue's. I found that Continue's request has a 'raw' parameter set to true. I changed the source code to add the 'raw' parameter to the request, and then the completion matched Continue's.

sonofsky2010 avatar Sep 21 '24 03:09 sonofsky2010
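For anyone following along: `raw` tells Ollama to send the prompt to the model verbatim instead of wrapping it in the model's own prompt template, which is exactly what a manually pre-filled FIM template needs. A sketch of what the modified request body would look like (model tag and sample prompt are illustrative):

```python
import json

# Sketch: with "raw": True, Ollama skips applying the model's prompt
# template, so the FIM tags below reach the model unchanged.
payload = {
    "model": "starcoder2:7b",
    "prompt": "<fim-prefix>def fib(n):<fim-suffix>    return n<fim-middle>",
    "raw": True,
    "stream": False,
}
print(json.dumps(payload))
```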

@sonofsky2010 Hi, I have tried the raw parameter, but I am still getting weird output from starcoder2:7b (there are template tags in the output). Do you have the complete request body the Continue plugin sent?

intitni avatar Sep 24 '24 07:09 intitni

@intitni I checked the request from Continue. I think it may be caused by wrong 'stop' parameters. Currently it sends 'stop' as an empty list, which may override the stop parameters that Ollama reads from the model's own parameters.

sonofsky2010 avatar Sep 30 '24 17:09 sonofsky2010