bitsandbytes
chore: delete useless buffered activation
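For QLoRA models, the quantized base weight $\mathbf{W}$ is frozen and never updated, so its gradient is not needed. The buffered activation $\mathbf{A}$ is only used in the backward pass to compute that gradient, which makes buffering it useless. This change suggests not saving $\mathbf{A}$ in `ctx`, which saves memory.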
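
To make the argument concrete, here is a short sketch of the backward pass, assuming the layer computes $\mathbf{Y} = \mathbf{A}\mathbf{W}^{\top}$ with $\mathbf{W}$ standing for the dequantized weight (this layout and the loss symbol $\mathcal{L}$ are assumptions for illustration, not taken from the PR):

$$
\frac{\partial \mathcal{L}}{\partial \mathbf{A}} = \frac{\partial \mathcal{L}}{\partial \mathbf{Y}}\,\mathbf{W},
\qquad
\frac{\partial \mathcal{L}}{\partial \mathbf{W}} = \left(\frac{\partial \mathcal{L}}{\partial \mathbf{Y}}\right)^{\top}\mathbf{A}
$$

Only the second expression depends on $\mathbf{A}$. When $\mathbf{W}$ is frozen, that term is never evaluated, so the backward pass needs only $\mathbf{W}$ and the incoming gradient $\partial \mathcal{L} / \partial \mathbf{Y}$.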