peft
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
```python
import torch
from transformers import AutoModelForSeq2SeqLM, T5Tokenizer
from peft import get_peft_config, get_peft_model, TaskType, PrefixTuningConfig, PeftModelForSeq2SeqLM, PeftModel

model_name_or_path = "t5-small"
tokenizer_name_or_path = "t5-small"

model = AutoModelForSeq2SeqLM.from_pretrained(model_name_or_path)
tokenizer = T5Tokenizer.from_pretrained(tokenizer_name_or_path)

peft_config = PrefixTuningConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    inference_mode=False,
    num_virtual_tokens=20,
)
model = get_peft_model(model, peft_config)
```
Hello, I am trying to finetune GPT-J for text generation by adapting [this notebook](https://colab.research.google.com/drive/1jCkpikz0J2o20FBQmYmAGdiKmJGOMo-o?usp=sharing). However, when I run `trainer.train()` I get a CUDA error that states the following: `RuntimeError:...
Also added links to datasets and models, plus enhanced config render with yaml command
# Feature request

We should leverage `trl` (https://github.com/lvwerra/trl), the recent library from Hugging Face for RLHF, to apply PPO using `peft` and LoRA. I think `peft` should just work...
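A minimal sketch of the `peft` side of this idea, assuming a LoRA-wrapped causal LM (the model name and hyperparameters below are illustrative); a PPO loop from `trl` (not shown) would then only ever update the small set of trainable LoRA parameters while the frozen base weights stay untouched:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, TaskType, get_peft_model

# Hypothetical base policy model; any causal LM supported by peft would do.
model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
base_model = AutoModelForCausalLM.from_pretrained(model_name)

# Wrap the policy with LoRA so that PPO only trains the low-rank adapter weights.
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,
    lora_alpha=32,
    lora_dropout=0.1,
    inference_mode=False,
)
policy = get_peft_model(base_model, lora_config)
policy.print_trainable_parameters()  # only the LoRA matrices require grad
```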
Thank you very much for sharing this library; it is going to be very useful for fine-tuning big models. It would be cool if the [Donut](https://huggingface.co/docs/transformers/model_doc/donut) model were supported. This...
I think right now, the dtype of the prompt embeddings and the model are tied together, since the weights are copied. It would be nice to have a different dtype for...
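For illustration, a rough sketch of the kind of workaround this feature would avoid, assuming the prompt-encoder parameter names contain `prompt` (an assumption about the current module naming): load the frozen base in fp16, then cast only the trainable prompt weights back to fp32.

```python
import torch
from transformers import AutoModelForSeq2SeqLM
from peft import PrefixTuningConfig, TaskType, get_peft_model

# Base model in half precision; the copied prompt weights inherit this dtype.
base = AutoModelForSeq2SeqLM.from_pretrained("t5-small", torch_dtype=torch.float16)
peft_config = PrefixTuningConfig(
    task_type=TaskType.SEQ_2_SEQ_LM, inference_mode=False, num_virtual_tokens=20
)
model = get_peft_model(base, peft_config)

# Assumed workaround: cast only the trainable prompt parameters to fp32,
# keeping the frozen fp16 base untouched.
for name, param in model.named_parameters():
    if param.requires_grad and "prompt" in name:
        param.data = param.data.float()
```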
closes: https://github.com/huggingface/peft/issues/62
T-Few is a PEFT method for few-shot learning that is currently the SOTA on many NLP benchmarks. It uses a nifty technique called (IA)^3 to update a small number of...
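For anyone curious about the core mechanism, here is a toy sketch (not the actual T-Few implementation) of what (IA)^3 does to a single linear projection: the pretrained weight stays frozen, and only a per-feature rescaling vector is learned.

```python
import torch
import torch.nn as nn

class IA3Linear(nn.Module):
    """A frozen linear layer whose output is rescaled element-wise by a
    learned vector, illustrating the core (IA)^3 idea (names are illustrative)."""

    def __init__(self, base_linear: nn.Linear):
        super().__init__()
        self.base = base_linear
        for p in self.base.parameters():
            p.requires_grad = False  # the pretrained weights stay frozen
        # One learned scale per output feature, initialized to 1 (identity).
        self.ia3_scale = nn.Parameter(torch.ones(base_linear.out_features))

    def forward(self, x):
        return self.base(x) * self.ia3_scale  # element-wise rescaling

# Only the tiny rescaling vector is trainable.
layer = IA3Linear(nn.Linear(768, 768))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # 768
```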
Changes made by this PR can be summarized as follows:
- Set the `cache` and `cache-dependency-path` arguments to enable caching and speed up CI times.
Why is this happening?

```python
batch = tokenizer("Two things are infinite: ", return_tensors="pt")

with torch.cuda.amp.autocast():
    output_tokens = model.generate(**batch, max_new_tokens=50)

print("\n\n", tokenizer.decode(output_tokens[0], skip_special_tokens=True))
```

It gives the following error: **AttributeError: 'NoneType' object...