Fred Bliss

Results: 9 issues by Fred Bliss

Hi, I love this project as a way to learn from scratch with local development. I was able to fine-tune the model, generate the checkpoints, and generate the samples. Is there an...

See: https://github.com/tloen/alpaca-lora/blob/main/generate.py Tried modifying the code to look like this, but no luck initially.

```python
from peft import PeftModel
from transformers import LLaMATokenizer, LLaMAForCausalLM, GenerationConfig

tokenizer = LLaMATokenizer.from_pretrained("decapoda-research/llama-7b-hf")
model = LLaMAForCausalLM.from_pretrained(...
```
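For comparison, here is a minimal end-to-end sketch of loading the base model plus a LoRA adapter with `PeftModel`, in the spirit of alpaca-lora's `generate.py`. It assumes a recent `transformers` release (where the classes are named `LlamaTokenizer`/`LlamaForCausalLM` rather than the older `LLaMA*` spelling) and uses a placeholder adapter directory; the prompt format is only illustrative.

```python
# Minimal sketch, not the exact generate.py code: base model + LoRA adapter + generation.
import torch
from peft import PeftModel
from transformers import LlamaForCausalLM, LlamaTokenizer, GenerationConfig

base = "decapoda-research/llama-7b-hf"   # base weights used for fine-tuning
adapter = "./lora-alpaca"                # placeholder: the output_dir of the fine-tuning run

tokenizer = LlamaTokenizer.from_pretrained(base)
model = LlamaForCausalLM.from_pretrained(base, torch_dtype=torch.float16, device_map="auto")
model = PeftModel.from_pretrained(model, adapter, torch_dtype=torch.float16)
model.eval()

prompt = "### Instruction:\nExplain LoRA in one sentence.\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(
    **inputs,
    generation_config=GenerationConfig(do_sample=True, temperature=0.7, top_p=0.9),
    max_new_tokens=128,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```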

See: https://github.com/PanQiWei/AutoGPTQ/ For reference, text gen ui is using it here: https://github.com/oobabooga/text-generation-webui/blob/main/modules/AutoGPTQ_loader.py
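For reference, loading a GPTQ-quantized checkpoint with AutoGPTQ (loosely in the style of the text-generation-webui loader linked above) looks roughly like the sketch below; the model directory is a placeholder and the exact keyword arguments may differ between AutoGPTQ versions.

```python
# Minimal sketch of loading a GPTQ-quantized model with AutoGPTQ
# (model directory is a placeholder, not from the original issue).
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

model_dir = "./llama-7b-4bit-gptq"  # contains quantized weights + quantize_config.json

tokenizer = AutoTokenizer.from_pretrained(model_dir, use_fast=True)
model = AutoGPTQForCausalLM.from_quantized(
    model_dir,
    device="cuda:0",
    use_safetensors=True,
)

inputs = tokenizer("Hello, my name is", return_tensors="pt").to("cuda:0")
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=32)[0]))
```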

good first issue

Please correct me if I'm wrong, but it looks like the current examples for LoRA training all build the loss function around the completion, which lines up with the LoRA example...
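For context, "loss around the completion" usually means the prompt tokens are masked out of the labels with `-100` so only the response tokens contribute to the cross-entropy. A minimal illustrative sketch (names are hypothetical, not taken from any specific trainer):

```python
# Illustrative completion-only loss masking.
import torch

IGNORE_INDEX = -100  # positions with this label are skipped by PyTorch cross-entropy

def build_labels(prompt_ids, response_ids):
    """Concatenate prompt + response, masking the prompt so loss is computed
    only on the completion (response) tokens."""
    input_ids = list(prompt_ids) + list(response_ids)
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(response_ids)
    return torch.tensor(input_ids), torch.tensor(labels)

# Example: 4 prompt tokens, 3 response tokens -> only the last 3 positions are supervised.
ids, labels = build_labels([1, 42, 7, 9], [15, 16, 2])
print(labels)  # tensor([-100, -100, -100, -100, 15, 16, 2])
```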

enhancement

Posted this in the Discord (https://discord.gg/pEPVK6gGfW) as well: thanks to the awesome work of the [unsloth](https://github.com/unslothai/unsloth) team, they've identified some bugs in Gemma implementations across the ecosystem: https://unsloth.ai/blog/gemma-bugs. I think these...

https://github.com/stanfordnlp/pyreft uses flash attention and pyvene (https://github.com/stanfordnlp/pyvene), but I don't see any specific kernels aside from FlashAttention. I tried this on my CUDA machine and it's neat; not sure how effective...
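For anyone trying to reproduce this, attaching a ReFT intervention looks roughly like the sketch below, which follows the pyreft README; exact argument names may differ between versions, and the base model is just a placeholder.

```python
# Rough sketch of attaching a LoReFT intervention with pyreft (per its README; versions may differ).
import torch
import transformers
import pyreft

model = transformers.AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-chat-hf", torch_dtype=torch.bfloat16, device_map="cuda"
)

reft_config = pyreft.ReftConfig(representations={
    "layer": 15,
    "component": "block_output",
    "low_rank_dimension": 4,
    "intervention": pyreft.LoreftIntervention(
        embed_dim=model.config.hidden_size, low_rank_dimension=4
    ),
})
reft_model = pyreft.get_reft_model(model, reft_config)
reft_model.print_trainable_parameters()  # only the intervention parameters train
```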

I have been using the trainer functionality for a while, but when trying it with [Hugging Face's new SmolLM 135M model](https://huggingface.co/HuggingFaceTB/SmolLM-135M), no matter what the dataset, I'd end up with EOS...
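The issue text is truncated here, but a quick way to sanity-check EOS handling for a new base model is to inspect the tokenizer's special tokens and confirm whether EOS actually gets appended during tokenization. A small diagnostic sketch (only the model name comes from the issue; everything else is illustrative):

```python
# Diagnostic sketch: inspect EOS/pad handling for SmolLM-135M.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM-135M")

print("eos:", tokenizer.eos_token, tokenizer.eos_token_id)
print("pad:", tokenizer.pad_token, tokenizer.pad_token_id)

# Many causal-LM tokenizers do NOT append EOS by default, so a trainer that
# relies on it has to add it explicitly during preprocessing.
ids = tokenizer("hello world")["input_ids"]
print("ends with EOS:", ids[-1] == tokenizer.eos_token_id)
```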

### Suggestion / Feature Request

I've been curious about this for a while now, and more so since reading Disentangling Dense Embeddings with Sparse Autoencoders (https://arxiv.org/html/2408.00657v2). It looks like most of the ingredients in pyvene...
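For reference, the core object in that paper is a plain sparse autoencoder trained over dense embedding activations. A minimal PyTorch sketch of the idea (dimensions and the L1 coefficient are illustrative; none of this comes from pyvene or the paper's code):

```python
# Minimal sparse autoencoder sketch (illustrative only).
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_hidden)
        self.decoder = nn.Linear(d_hidden, d_model)

    def forward(self, x):
        # ReLU keeps the latent code non-negative; the L1 penalty below makes it sparse.
        z = torch.relu(self.encoder(x))
        return self.decoder(z), z

sae = SparseAutoencoder(d_model=768, d_hidden=768 * 8)
x = torch.randn(32, 768)                   # a batch of dense embeddings
recon, z = sae(x)
l1_coeff = 1e-3                            # sparsity strength (illustrative)
loss = nn.functional.mse_loss(recon, x) + l1_coeff * z.abs().mean()
loss.backward()
```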

enhancement

Just FYI, think this is failing because of a LoRA with only certain blocks trained:

```
File "flux-fp8-api/flux_pipeline.py", line 163, in load_lora
    self.model = lora_loading.apply_lora_to_model(self.model, lora_path, scale)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File ".../miniconda3/envs/flux/lib/python3.11/site-packages/torch/utils/_contextlib.py",...
```
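If it helps with debugging, one way to see which blocks a LoRA actually covers is to list the keys in the safetensors file before applying it, so a loader could skip modules that have no adapter weights. A purely diagnostic sketch (`lora_path` is a placeholder, not from the traceback):

```python
# Diagnostic sketch: list which modules a LoRA checkpoint actually contains.
from safetensors import safe_open

lora_path = "my-partial-lora.safetensors"  # placeholder

with safe_open(lora_path, framework="pt", device="cpu") as f:
    keys = sorted(f.keys())

# Group by the module path before the lora_A / lora_B suffix to see which
# blocks actually have adapter weights.
modules = sorted({k.rsplit(".lora", 1)[0] for k in keys})
for m in modules:
    print(m)
print(f"{len(modules)} modules with LoRA weights")
```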