PeiqinSun

Results 8 issues of PeiqinSun

Thank you for your amazing work. I have some questions. In my CIFAR-10 experiments, I use 4x4 blocks and cv2.dct, like this: ```python def run_DCT(self, signal): rows = (signal.shape[0] //...

We have implemented 4-bit QLoRA. Thanks to an optimized kernel implementation of back-propagation, the fine-tuning speed is currently similar to that of 8-bit LoRA. You are welcome to try it and open issues: https://github.com/megvii-research/Sparsebit/tree/main/large_language_models/alpaca-qlora

First, thanks for your nice work! When I use flash_attention_n, I found a bug that occurs when attn_mask is changed from None to attention_mask. How can I fix it?...

Hi, your open-source work is excellent. While reproducing snan-print, I noticed that in the training script, the load_from attribute for the student is set to a checkpoint that can already...