Elephant Panda
Please support saving quantized models in 4-bit, 2-bit, and 1-bit formats: 1) to make file sizes smaller, and 2) to support quantized models.
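For what I mean by smaller files, here is a minimal sketch of 4-bit packing (two values per byte) in plain NumPy. The min/max scale scheme is just illustrative, not any library's actual on-disk format:

```python
import numpy as np

def pack_4bit(weights: np.ndarray):
    """Quantize a float32 array to 4-bit and pack two values per byte."""
    w = weights.astype(np.float32).ravel()
    scale = float(w.max() - w.min()) / 15.0 or 1.0
    zero = float(w.min())
    q = np.clip(np.round((w - zero) / scale), 0, 15).astype(np.uint8)
    if q.size % 2:                      # pad to an even count of nibbles
        q = np.append(q, np.uint8(0))
    packed = (q[0::2] << 4) | q[1::2]   # high nibble | low nibble
    return packed, scale, zero

def unpack_4bit(packed: np.ndarray, scale: float, zero: float, n: int):
    """Undo pack_4bit: split nibbles and dequantize back to float32."""
    q = np.empty(packed.size * 2, dtype=np.uint8)
    q[0::2] = packed >> 4
    q[1::2] = packed & 0x0F
    return q[:n].astype(np.float32) * scale + zero

w = np.random.randn(1000).astype(np.float32)
packed, scale, zero = pack_4bit(w)
print(packed.nbytes, "bytes instead of", w.nbytes)  # ~500 vs 4000
```

Roughly an 8x reduction versus float32, minus a couple of floats of metadata per tensor.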
Has this been abandoned? Any news on the C# training roadmap? Are there any builds with C# training?
I've tried 384x384 and got decent results. 256x256, though, is bad.
Free the weights! I will pay ten dollars for download costs.
I heard there is a 7B-weight model. The float16 version would then be about 14GB. I guess this is out of reach for the 3080. (Maybe it...
> I believe with CPU offloading, it should be possible.
>
> A related guide: https://huggingface.co/docs/accelerate/usage_guides/big_modeling

Interesting... this might be useful for other models I'm running. I have a feeling...
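The pattern from that accelerate guide looks roughly like this (the model id and checkpoint path below are placeholders, not any specific release):

```python
from accelerate import init_empty_weights, load_checkpoint_and_dispatch
from transformers import AutoConfig, AutoModelForCausalLM

# Build the model skeleton without allocating real weight memory.
config = AutoConfig.from_pretrained("some/7b-model")      # placeholder model id
with init_empty_weights():
    model = AutoModelForCausalLM.from_config(config)

# Load the checkpoint, splitting layers between GPU, CPU RAM and disk automatically.
model = load_checkpoint_and_dispatch(
    model,
    checkpoint="path/to/checkpoint",                       # placeholder path
    device_map="auto",
    offload_folder="offload",                              # spills to disk if RAM runs out
)
```

With `device_map="auto"` the layers that don't fit in VRAM get kept on the CPU (or disk) and moved over as needed, which is why it's slower but still works.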
> FlexGen 4bit is already here, so I think you can run a 20B model locally, let's wait and see...

I've no doubt it can be run locally. The question...
> No chance

Well it did #105. (12GB RAM and 16GB VRAM) 😎 Lesson - don't listen to doubters. 😁
7B in float16 will be 14GB, and if quantized to uint8 it could be as low as 7GB. But on graphics cards, from what I've tried with other models, it...
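The back-of-the-envelope arithmetic for the weights alone (not counting activations or KV cache, which add more on top):

```python
# Weight memory for a 7B-parameter model at different precisions
params = 7e9
for name, bytes_per_param in [("float32", 4), ("float16", 2), ("uint8", 1), ("4-bit", 0.5)]:
    print(f"{name}: {params * bytes_per_param / 1e9:.1f} GB")
# float32: 28.0 GB, float16: 14.0 GB, uint8: 7.0 GB, 4-bit: 3.5 GB
```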
> With KoboldAI I was able to run GPT-J 6B on my 8GB 3070 Ti by offloading the model to my RAM

How fast was it?