hpnyaggerman

Results: 14 comments of hpnyaggerman

The error seems to stem from using two GPUs for training. Using just one makes the issue go away.

No prototype? That seems like a sorely needed feature, given that most available language models can't process more than 2k tokens at once.

Just download the original models here: https://github.com/facebookresearch/llama/pull/73

Sounds like something worth investigating

In my experience, this PR worked a bit better than the KoboldAI API implementation in text-generation-webui. For example, with `--auto-devices --gpu-memory 8 --cpu-memory 45 --no-stream --extensions api` on text-generation-webui...
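For reference, a minimal sketch of how those flags might be passed on launch. The `server.py` entry point and the meaning of each flag are assumptions based on text-generation-webui's usual conventions, not something stated in this comment:

```shell
# Hypothetical launch command for text-generation-webui.
# --auto-devices    split model layers across GPU and CPU automatically
# --gpu-memory 8    cap GPU VRAM usage at roughly 8 GiB
# --cpu-memory 45   allow up to ~45 GiB of system RAM for offloaded weights
# --no-stream       return the full completion at once instead of streaming tokens
# --extensions api  enable the API extension so external clients can connect
python server.py --auto-devices --gpu-memory 8 --cpu-memory 45 --no-stream --extensions api
```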

I am pretty sure this is related to https://github.com/TavernAI/TavernAI/issues/76.

Same issue with https://huggingface.co/reeducator/vicuna-13b-free

Has anyone found a solution? What is even causing the issue?