Jake36921
Any way to change the max threads to 12? I saw an earlier [issue](https://github.com/bes-dev/stable_diffusion.openvino/issues/10), but it didn't seem to work.
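One way to try capping the thread count, as a hedged sketch only: the pipeline would need to be edited so that its OpenVINO `Core`/`compile_model` calls receive this config, and the exact property key depends on the installed OpenVINO version, so this is an assumption rather than a confirmed fix for this repo.

```python
# Hedged sketch: limit CPU inference threads via an OpenVINO runtime property.
# The key name is version-dependent ("INFERENCE_NUM_THREADS" on newer 2022.x
# releases, "CPU_THREADS_NUM" on older ones) -- check which one your build accepts.
from openvino.runtime import Core

core = Core()
core.set_property("CPU", {"INFERENCE_NUM_THREADS": "12"})  # or {"CPU_THREADS_NUM": "12"} on older builds

# The same dict can also be passed as the config argument when compiling a model:
# compiled = core.compile_model(model, "CPU", {"INFERENCE_NUM_THREADS": "12"})
```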
I know the location of the model but not the name. How do I replace it with a fine-tuned model, e.g. waifu diffusion?
### Describe the bug
Tried to generate a response, but no output was generated.

### Is there an existing issue for this?
- [X] I have searched the existing issues

### Reproduction
...
(base) PS E:\Games\llama.cpp> python3 convert.py OPT-13B-Erebus-4bit-128g.safetensors --outtype q4_1 --outfile 4ggml.bin
Loading model file OPT-13B-Erebus-4bit-128g.safetensors
Loading vocab file tokenizer.model
Traceback (most recent call last):
  File "E:\Games\llama.cpp\convert.py", line 1147, in <module>
    main()
  File ...
Repos like [Gpt4all](https://github.com/nomic-ai/gpt4all), [llama.cpp](https://github.com/ggerganov/llama.cpp), and [alpaca.cpp](https://github.com/antimatter15/alpaca.cpp) run on the CPU quite fast while using fewer resources.
Allow cutscenes that aren't rendered to be played instead of showing a black screen. Would be very nice to have.
Everything works fine except it's using my CPU instead of the GPU.
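A minimal sanity check, assuming a PyTorch-based backend (not specific to this repo): confirm the installed torch build can see the GPU at all, since an accidentally installed CPU-only wheel is a common cause of this symptom.

```python
# Minimal sketch: a CPU-only torch build reports no CUDA support,
# in which case nothing can run on the GPU regardless of webui settings.
import torch

print(torch.__version__)           # CPU-only wheels often end in "+cpu"
print(torch.cuda.is_available())   # must be True for GPU execution
print(torch.version.cuda)          # None on CPU-only builds

# If CUDA is available, a model or tensor is placed on the GPU explicitly:
# model = model.to("cuda")
```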
Significantly reduces RAM/VRAM usage and makes inference faster.
The following flags have been taken from the environment variable 'OOBABOOGA_FLAGS': --fkdlsja >nul 2>&1 & python bot.py --token --chat --model-menu
To use the CMD_FLAGS Inside webui.py, unset 'OOBABOOGA_FLAGS'.
bin E:\etc\bot\ChatLLaMA\oobabooga_windows\installer_files\env\lib\site-packages\bitsandbytes\libbitsandbytes_cpu.dll...