
No support for the new Ooba Booga API (OpenAI API)

Open NZ3digital opened this issue 2 years ago • 14 comments

With the newest version of the Ooba Booga Text Generation WebUI, the old KoboldAI API has been replaced with a new OpenAI-compatible API. My TavernAI was not able to connect to this new API with either the "Text generation web UI" API setting or the OpenAI API setting. (Screenshots omitted.)
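Editor's note: to clarify what changed, here is a small illustrative sketch of the difference between the two request shapes involved. The endpoint paths and field names reflect my understanding of the old KoboldAI-style API and the new OpenAI-compatible one; the `build_request` helper is hypothetical, and you should verify the details against your text-generation-webui version.

```python
# Sketch of the two backend request shapes (illustrative assumptions,
# not taken from this thread -- verify against your webui version).

def build_request(api: str, prompt: str, base: str = "http://127.0.0.1:5000"):
    """Return (url, json_payload) for the old vs. new backend API."""
    if api == "kobold":
        # Old KoboldAI-style API that TavernAI expected.
        return (f"{base}/api/v1/generate",
                {"prompt": prompt, "max_length": 200})
    if api == "openai":
        # New OpenAI-compatible completions endpoint.
        return (f"{base}/v1/completions",
                {"prompt": prompt, "max_tokens": 200})
    raise ValueError(f"unknown api: {api}")

old_url, _ = build_request("kobold", "Hello")
new_url, _ = build_request("openai", "Hello")
print(old_url)  # http://127.0.0.1:5000/api/v1/generate
print(new_url)  # http://127.0.0.1:5000/v1/completions
```

A front end wired to the first shape cannot talk to a server exposing only the second, which is the mismatch this issue describes.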

NZ3digital avatar Dec 07 '23 16:12 NZ3digital

You can try the "reverse proxy" option. But tavern disconnects after every message for me.

gnometsunami avatar Dec 10 '23 23:12 gnometsunami

The API still does not work, and the reverse proxy does not work either: apparently Tavern expects the response faster than it arrives and disconnects after a few seconds, even though I can see the WebUI generate the response after a bit of loading, just not as fast as Tavern expects. Please help.

accessyapps avatar Jan 23 '24 01:01 accessyapps

For now, can anyone maybe send a link to the last version that had the old API, or tell me a way to get it back? I had a version, but it had a major bug and often just quit out of nowhere without an error.

accessyapps avatar Jan 23 '24 02:01 accessyapps

For now, can anyone maybe send a link to the last version that had the old API

Yes.

The API was changed on November 12th, 2023; the most recent release before that was the snapshot from November 5th.

Here is the link to the zip file: https://github.com/oobabooga/text-generation-webui/archive/refs/tags/snapshot-2023-11-05.zip

And this is the direct link to the release page for that snapshot, the last Text Gen WebUI to use the "old" API: https://github.com/oobabooga/text-generation-webui/releases/tag/snapshot-2023-11-05

However, I would encourage you to consider the viability of KoboldCpp as a direct replacement for Ooba. Currently KCpp has a slight speed/efficiency advantage over TextGen WebUI, both in performance on identical hardware and in compatibility with front ends, models, etc., and KCpp has always had a major lead over Ooba as far as "user frustration" is concerned. KCpp is fully compatible with any downloaded model format that you may have previously run locally using Ooba on your system. It is also entirely self-contained and needs no extra libraries, drivers, runtimes, etc.: koboldcpp.exe is a true single-file solution.

Here's a direct link to the latest koboldcpp.exe if you want to have a go at it. (This is the full-featured version, which takes advantage of CUDA; if you have an AMD Radeon or Intel Arc GPU those are fine as well, but the binaries for those are in entirely different repositories.)

koboldcpp.exe link: https://github.com/LostRuins/koboldcpp/releases/download/v1.55.1/koboldcpp.exe

There is literally only that one single file to download; it is all you need. Once downloaded, open a command prompt at (or navigate your command prompt to) wherever koboldcpp.exe is located. Here is an example set of args that would run a Q4_K_M-quantized (GGUF) 13B-sized LLM on a machine with an NVIDIA GPU having 12 gigabytes or more of VRAM:

koboldcpp.exe --model C:\path\to\model-13B-Q4_K_M.gguf --port 5000 --smartcontext --contextsize 4096 --usecublas 0 mmq --gpulayers 43

You mentioned that you use a screen reader (and I admit to 100% missing it the first time! I do apologize), and bearing that in mind I went down the command-prompt route first.

KoboldCpp does also have a functional GUI; one can actually just drag and drop the model .gguf file onto koboldcpp.exe itself, and most of the time it will have determined the correct settings automatically before the GUI pops up and asks if you're ready to load it. But I just had a quick peek under the hood of the KCpp Windows launcher GUI, and I'm fairly confident a screen reader has about "a snowball's chance in you-know-where" of reading it in any meaningful capacity. While that is unfortunate from an accessibility perspective, it doesn't actually matter one iota here: I use KCpp from the command line myself, and nothing at all requires the GUI.

(I should also mention that I intentionally omitted the --quiet argument from my example command line above; feel free to include it once the little bit of "Hey boss! Things are currently happening right now!" verbosity is no longer useful.)

I'm not sure what hardware you have been running your models on locally (using Ooba). As I said, that set of args targets the use case of a 13B-size model in Q4_K_M-quantized GGUF format loaded onto an NVIDIA GPU with at least 12 GiB of VRAM onboard (also presuming Windows, but the Linux command line would be the same). If your GPU is only a 6 GB card, you likely already have Q4_K_M quants of 7B-sized models that Ooba has previously run on that hardware; for a Model-7B-Q4_K_M.gguf you would use 22 GPU layers instead of the 43 given above.
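Editor's note: to make the layer-count scaling above concrete, here is a small illustrative sketch. The 43 and 22 figures come from the advice in this comment; the table, the `build_command` helper, and everything else are hypothetical rules of thumb, not official KoboldCpp guidance (KoboldCpp can also estimate layer counts for you automatically).

```python
# Rule-of-thumb --gpulayers values for Q4_K_M GGUF models, taken from
# the advice above (illustrative only; not official KoboldCpp guidance).
GPULAYERS_Q4_K_M = {
    "7B": 22,   # fits a ~6 GB card, per the comment above
    "13B": 43,  # fits a >=12 GiB card, per the comment above
}

def build_command(model_path: str, model_size: str, port: int = 5000) -> str:
    """Assemble the example koboldcpp.exe command line shown earlier."""
    layers = GPULAYERS_Q4_K_M[model_size]  # KeyError if no rule of thumb
    return (f"koboldcpp.exe --model {model_path} --port {port} "
            f"--smartcontext --contextsize 4096 --usecublas 0 mmq "
            f"--gpulayers {layers}")

print(build_command(r"C:\path\to\model-7B-Q4_K_M.gguf", "7B"))
```

Running this prints the 7B variant of the example command, with --gpulayers 22 substituted for 43.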

KoboldCpp Official repository link: https://github.com/LostRuins/koboldcpp

Direct Readme.md link for same: https://github.com/LostRuins/koboldcpp/blob/concedo/README.md

Again, feel free to pop into the TavernAI Discord and drop me an @FunKengine (I am happy to use voice chat as well.)

I completely understand why TavernAI's uncluttered UI is preferable to you versus SillyTavern's, which is admittedly visually appealing (Ross is good at making beautiful UIs) but undeniably jam-packed to the gunwales with settings, extras, plug-ins, and purely visual fluff compared to TavernAI. Getting you back up and running sooner than TavernAI can get the TextGen WebUI API option fixed in the code is legitimately important to me, as one of the TavernAI maintainers.

Whether that (getting you running again) involves downgrading your currently incompatible Ooba install to the previous version of Text Generation WebUI, or me providing the support and answers you need to change over to KoboldCpp as a direct replacement for Ooba, I'd be happy to do either. (KCpp is actually more than just Ooba's equal: it has a small performance edge, and there are a handful of things KCpp can do today that Ooba cannot. "Yet." I'm sure it is only a matter of time.) I'd even be happy to help get both working on your end, so you can ease into the transition.

However, for at least the foreseeable future, TavernAI and the 'new' Ooba API will continue to be "on the rocks with each other," and continuing to use TavernAI (which I am happy you do) effectively means being 'stuck with' deprecated Ooba circa November 5th. Inevitably that means missing out on newer functionality and improvements to speed, resource consumption, etc.

Whereas if I help get you fully back up to speed with KCpp, at or above your current level of familiarity with Ooba... (This is likely a lot easier than you think. KCpp will be very familiar to you as an Ooba user; the similarities far exceed the differences, right down to using the same GGUF quantized models you already have on your system. That makes KCpp close to Ooba from the end-user perspective, and very different from the KoboldAI-Client it gets its name from. So if you looked into KoboldAI-Client in the past and it was not your cup of tea, that's fine: KCpp is an entirely different kettle of fish.)

I prefer not to leave you handcuffed to a deprecated version of Text Gen WebUI if I can at all avoid it... Neither do I wish to strong-arm you into joining "Team Kobold," especially if all you really want/need is "the Ooba that still works with TavernAI." :)

--Josh AKA FunkEngine

FunkEngine2023 avatar Jan 23 '24 06:01 FunkEngine2023

I prefer not to leave you handcuffed to a deprecated version of Text Gen WebUI if I can at all avoid it... Neither do I wish to strong-arm you into joining "Team Kobold," especially if all you really want/need is "the Ooba that still works with TavernAI." :)

Legendary explanation, thank you. I ran into this issue yesterday; I will try KoboldCpp today.

always-oles avatar Feb 17 '24 14:02 always-oles

I prefer not to leave you handcuffed to a deprecated version of Text Gen WebUI if I can at all avoid it... Neither do I wish to strong-arm you into joining "Team Kobold," especially if all you really want/need is "the Ooba that still works with TavernAI." :)

Legendary explanation, thank you. I ran into this issue yesterday; I will try KoboldCpp today.

https://github.com/TavernAI/TavernAI/blob/main/colab/TAI%E2%99%A5KCpp_RC_3.0.ipynb

FunkEngine2023 avatar Feb 17 '24 14:02 FunkEngine2023

My experience of using Google Colab is awful; this time I don't see the Python code in the block because it overlaps with something. (Screenshot omitted.) Opening the code in a new tab works, but man... So I had to try Kobold locally, and it works perfectly, just what I needed.

@FunkEngine2023 I appreciate the effort you put into that Colab, thanks for the help!

always-oles avatar Feb 18 '24 14:02 always-oles

My experience of using Google Colab is awful; this time I don't see the Python code in the block because it overlaps with something

"They can't break what they can't see." (Hiding the form shows the code.)

Joe Newbie User cannot accidentally type something into the Colab code that breaks it if he cannot see the code; it's an effort to ensure the "first-time user experience" works...

FunkEngine2023 avatar Feb 18 '24 14:02 FunkEngine2023

@FunkEngine2023 Oh, so it was supposed to be like that... Now it's clear to me; all that "LINK IS NOT FOR YOU" stuff makes sense now! But that was not my main problem: after running all the scripts, it just wasn't working for me in the Colab; the model didn't load because of various errors. I debugged and re-ran the scripts from the beginning multiple times. After a few hours I switched to local Kobold, and the problem was solved.

always-oles avatar Feb 18 '24 18:02 always-oles

You guys should probably remove ooba webui from the list of supported backends since it's not supported anymore. Wasted my time with this for no reason.

noidedxyz avatar Feb 24 '24 05:02 noidedxyz

This will be fixed within a couple of days.

Borov666 avatar Feb 24 '24 11:02 Borov666

Fixed in https://github.com/TavernAI/TavernAI/commit/a1f84af753fd49a8c3e40c5406e8670be17ae3dc. It works with the standard textgen-webui API flag. Just in case, I'm leaving the thread open for a while.

Borov666 avatar Feb 26 '24 01:02 Borov666

Thanks for updating the API; it appears to work just fine. From the above, I had assumed there weren't any intentions of updating this any further. Much appreciated.

noidedxyz avatar Feb 27 '24 16:02 noidedxyz

Not sure what happened here, but it wasn't working for me today when I tried it. These are the steps to make it work:

  1. Don't use the .exe; use the .bat file.
  2. Open the source in Visual Studio Code and regex-replace `( = ["'])(no_connection)` with `$1bypass_$2`.
  3. This replaces `data.result = "no_connection"` and `online_status = 'no_connection'` with `data.result = "bypass_no_connection"` and `online_status = 'bypass_no_connection'`. This is what the changes look like: https://github.com/arrmansa/TavernAI/commit/f3872cb54522abb338e415ca4cffde605470eeed
  4. It still throws an error on the first message, but keeps working fine afterwards (does not force a disconnect).

This is what the UI looks like after everything (screenshot omitted). You can use words other than "bypass"; it makes no difference.
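Editor's note: for anyone applying the regex step above by hand, here is a small illustrative sketch of what that find-and-replace does to the two affected lines. Note that Python uses `\1`/`\2` backreferences where VS Code uses `$1`/`$2`; the line contents come from step 3 above.

```python
import re

# The VS Code find/replace from step 2, translated to Python syntax:
#   find:    ( = ["'])(no_connection)
#   replace: $1bypass_$2   (Python spelling: \1bypass_\2)
pattern = re.compile(r'( = ["\'])(no_connection)')

lines = [
    'data.result = "no_connection"',
    "online_status = 'no_connection'",
]
patched = [pattern.sub(r"\1bypass_\2", line) for line in lines]
print(patched[0])  # data.result = "bypass_no_connection"
print(patched[1])  # online_status = 'bypass_no_connection'
```

Group 1 captures the ` = ` plus the opening quote, so the replacement only prefixes the string value itself with `bypass_`, which is why any word works in place of "bypass".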

arrmansa avatar May 10 '24 20:05 arrmansa