Lily Erickson

Showing 6 issues by Lily Erickson

JAX is a library optimized for GPU training, and the NeoX repo itself already requires significant GPU resources that could benefit from offloading.

enhancement

This should be compatible with DeepSpeed; would it be possible to add a basic training pipeline?

enhancement
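For context, here is a minimal sketch of what such a basic DeepSpeed training pipeline could look like. The toy model, data, and config values are illustrative assumptions, not the repo's actual setup.

```python
import deepspeed
import torch

# Illustrative DeepSpeed config; values are assumptions, not the repo's.
ds_config = {
    "train_micro_batch_size_per_gpu": 4,
    "gradient_accumulation_steps": 1,
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
    "zero_optimization": {"stage": 2},  # shards optimizer state across GPUs
}

model = torch.nn.Linear(1024, 1024)  # stand-in for the real model

# deepspeed.initialize wraps the model in an engine that owns the
# optimizer, gradient accumulation, and ZeRO partitioning.
engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)

inputs = torch.randn(4, 1024).to(engine.device)
targets = torch.randn(4, 1024).to(engine.device)
for step in range(10):
    loss = torch.nn.functional.mse_loss(engine(inputs), targets)
    engine.backward(loss)  # engine handles scaling and accumulation
    engine.step()
```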

When you say train_on_inputs = False, I presume you mean to mask out the prompt and compute the loss only on the response that the model is supposed to produce...
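For reference, a sketch of that masking convention, assuming the common implementation where prompt tokens get the label -100 (which cross-entropy loss ignores); the tokenizer, prompt, and response below are illustrative.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # illustrative model
prompt = "### Instruction:\nSummarize the text.\n### Response:\n"
response = "Here is the summary."

prompt_ids = tokenizer(prompt, add_special_tokens=False)["input_ids"]
response_ids = tokenizer(response, add_special_tokens=False)["input_ids"]

# Labels of -100 on the prompt span mean the loss is computed only
# on the response tokens.
input_ids = prompt_ids + response_ids
labels = [-100] * len(prompt_ids) + response_ids
```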

https://github.com/huggingface/peft/issues/285 Still trying to sort out how to proceed on my own branch, but I figured I'd bring it up here too, since it'll invalidate the codebase, and maybe more...

So in the previous peft version, before the recent AdaLoRA changes, set_peft_model_state_dict returned a wrapped model object. Now it appears to function as a mutator (returns None). So I changed...
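One way to stay compatible with both behaviors described above is to fall back to the existing reference when nothing is returned; load_adapter here is a hypothetical helper, and the checkpoint path is illustrative.

```python
import torch
from peft import set_peft_model_state_dict

def load_adapter(model, checkpoint_path):
    """Load adapter weights across the peft API change described above
    (hypothetical helper, not part of peft)."""
    adapters_weights = torch.load(checkpoint_path, map_location="cpu")
    returned = set_peft_model_state_dict(model, adapters_weights)
    # Older peft versions returned the wrapped model; newer versions
    # mutate `model` in place and return None, so keep the original
    # reference in that case.
    return returned if returned is not None else model
```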

The attribute self.model_max_length is not universally set in all tokenizers. In these cases, the tokenizer will crash the program without the listed change. I noticed it specifically when loading tokenizers...
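A sketch of the kind of defensive fallback described; the default length used here is an assumption, not the project's actual value.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # illustrative tokenizer

# Guard against tokenizers that do not define model_max_length,
# rather than letting the attribute access raise.
max_length = getattr(tokenizer, "model_max_length", None)
if max_length is None:
    max_length = 2048  # assumed project default
```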