Christian
I think the report meant additional GPU memory. Let me explain: the model takes ~5GB, so I think they calculate the memory as used_memory - 5GB, thereby indicating how...
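To make that arithmetic concrete, here is a minimal sketch of the calculation as I understand it. The function name and the 7.5 GB reading are hypothetical; only the ~5GB model size comes from the comment above, and this may not be how the report actually computes its figure.

```python
def additional_memory_gb(used_memory_gb: float, model_size_gb: float = 5.0) -> float:
    """Return the GPU memory used beyond the model weights, in GB.

    Assumption: the reported number is simply used_memory - model_size.
    """
    if used_memory_gb < model_size_gb:
        raise ValueError("used memory is smaller than the model size")
    return used_memory_gb - model_size_gb


# Hypothetical reading: 7.5 GB in use while the ~5 GB model is loaded.
print(additional_memory_gb(7.5))
```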
The tutorial and code that I followed is "guess the correlation" from "https://torch.mlverse.org/start/guess_the_correlation/". The error message appears when executing the following command: fitted <- net %>% setup(...
Yes, I saw. But it goes from fp16/bf16 straight to int4; it would be nice if there were also some intermediate quantizations. Taking Flux as an example, the 5/6 bit quants...
Thanks for the info on Qwen. Regarding offloading, I mean loading and unloading the various parts or layers into RAM so as to fit Sana into an 8 or 6...