thot experiment
I'd been dealing with this same bug (I thought it was my own off-by-one; it was driving me up the wall), but it seems to be better now. I can repro the...
Just successfully launched; I changed nothing, just ran the script again on the off chance it would work. ¯\_(ツ)_/¯
FWIW I still occasionally get this. It was always intermittent and seems to be better now, but it does still happen from time to time. I just bumped to...
Same bug; does anyone know which commit this broke on? It was working maybe 2 or 3 days ago. I'll look into this. I believe going forward we should probably switch to...
So it looks like this API is something that's autogenerated by Gradio itself (sorry again for the naivete here; I really have no idea what's going on), and because of...
"Transformers bump" commit breaks gpt4-x-alpaca on an RTX 3090: the model loads but outputs gibberish
OK, so I'm trying to gather all the info I can about this gibberish issue, as it appears to persist for me regardless of tokenizer config, per this comment: [#1029](https://github.com/oobabooga/text-generation-webui/issues/1029#issuecomment-1502539767)...
I have the same issue as of a recent commit, and I am not using `--no-half-vae`. It happens on both a GV100 and a 1080 Ti, using torch 2 and xformers; I will try falling back...
FWIW I do not have `--opt-sdp-no-mem-attention` set explicitly, but perhaps it gets turned on implicitly by some other flag or configuration state? (I don't even see it listed in [the...
FWIW I'm able to run 3-bit 65B LLaMA on a single 32 GB GPU using AutoGPTQ, which is kinda neat, and it seems to be close to 65B q4 in...
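A back-of-envelope check (not from the thread; overhead for the KV cache, activations, and quantization metadata is ignored) shows why 3-bit weights squeeze 65B onto a 32 GB card while 4-bit is tight:

```python
def weight_size_gib(n_params: float, bits: int) -> float:
    """Raw packed weight size in GiB: params * bits, converted to bytes, then GiB."""
    return n_params * bits / 8 / 2**30

# 65B at 3-bit vs 4-bit quantization
print(f"3-bit 65B: {weight_size_gib(65e9, 3):.1f} GiB")  # ~22.7 GiB, fits on a 32 GB GPU
print(f"4-bit 65B: {weight_size_gib(65e9, 4):.1f} GiB")  # ~30.3 GiB, little headroom left
```

The ~7.5 GiB saved by dropping from 4-bit to 3-bit is what leaves room for the runtime overhead on a single 32 GB device.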
> @ortegaalfredo tried and got the same issue

I've been running into the same issue trying to run 65B on a heterogeneous system w/ a 1080 Ti 11 GB + GV100 32 GB...