Fred Bliss

Results: 9 issues by Fred Bliss

Hi, I love this project as a way to learn from scratch with local development. I was able to fine-tune the model, generate the checkpoints, and generate the samples. Is there an...

See: https://github.com/tloen/alpaca-lora/blob/main/generate.py Tried modifying the code to look like this, but no luck initially.

```python
from peft import PeftModel
from transformers import LLaMATokenizer, LLaMAForCausalLM, GenerationConfig

tokenizer = LLaMATokenizer.from_pretrained("decapoda-research/llama-7b-hf")
model = LLaMAForCausalLM.from_pretrained(...
```
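For comparison, here is a minimal end-to-end sketch of loading the base model plus a LoRA adapter with `PeftModel`, in the spirit of alpaca-lora's `generate.py`. It assumes a recent `transformers` release (where the classes are named `LlamaTokenizer`/`LlamaForCausalLM` rather than the older `LLaMA*` spelling) and uses a placeholder adapter directory; the prompt format is only illustrative.

```python
# Minimal sketch, not the exact generate.py code: base model + LoRA adapter + generation.
import torch
from peft import PeftModel
from transformers import LlamaForCausalLM, LlamaTokenizer, GenerationConfig

base = "decapoda-research/llama-7b-hf"   # base weights used for fine-tuning
adapter = "./lora-alpaca"                # placeholder: the output_dir of the fine-tuning run

tokenizer = LlamaTokenizer.from_pretrained(base)
model = LlamaForCausalLM.from_pretrained(base, torch_dtype=torch.float16, device_map="auto")
model = PeftModel.from_pretrained(model, adapter, torch_dtype=torch.float16)
model.eval()

prompt = "### Instruction:\nExplain LoRA in one sentence.\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(
    **inputs,
    generation_config=GenerationConfig(do_sample=True, temperature=0.7, top_p=0.9),
    max_new_tokens=128,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```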

See: https://github.com/PanQiWei/AutoGPTQ/ For reference, text gen ui is using it here: https://github.com/oobabooga/text-generation-webui/blob/main/modules/AutoGPTQ_loader.py
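For reference, loading a GPTQ-quantized checkpoint with AutoGPTQ (loosely in the style of the text-generation-webui loader linked above) looks roughly like the sketch below; the model directory is a placeholder and the exact keyword arguments may differ between AutoGPTQ versions.

```python
# Minimal sketch of loading a GPTQ-quantized model with AutoGPTQ
# (model directory is a placeholder, not from the original issue).
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

model_dir = "./llama-7b-4bit-gptq"  # contains quantized weights + quantize_config.json

tokenizer = AutoTokenizer.from_pretrained(model_dir, use_fast=True)
model = AutoGPTQForCausalLM.from_quantized(
    model_dir,
    device="cuda:0",
    use_safetensors=True,
)

inputs = tokenizer("Hello, my name is", return_tensors="pt").to("cuda:0")
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=32)[0]))
```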

good first issue

Please correct me if I'm wrong, but it looks like the current examples for LoRA training all build the loss function around the completion, which lines up with the LoRA example...
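For context, "loss around the completion" usually means the prompt tokens are masked out of the labels with `-100` so only the response tokens contribute to the cross-entropy. A minimal illustrative sketch (names are hypothetical, not taken from any specific trainer):

```python
# Illustrative completion-only loss masking.
import torch

IGNORE_INDEX = -100  # positions with this label are skipped by PyTorch cross-entropy

def build_labels(prompt_ids, response_ids):
    """Concatenate prompt + response, masking the prompt so loss is computed
    only on the completion (response) tokens."""
    input_ids = list(prompt_ids) + list(response_ids)
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(response_ids)
    return torch.tensor(input_ids), torch.tensor(labels)

# Example: 4 prompt tokens, 3 response tokens -> only the last 3 positions are supervised.
ids, labels = build_labels([1, 42, 7, 9], [15, 16, 2])
print(labels)  # tensor([-100, -100, -100, -100, 15, 16, 2])
```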

enhancement

Posted this in the Discord (https://discord.gg/pEPVK6gGfW) as well: thanks to the awesome work of the [unsloth](https://github.com/unslothai/unsloth) team, they've identified some bugs in Gemma implementations across the ecosystem: https://unsloth.ai/blog/gemma-bugs. I think these...

https://github.com/stanfordnlp/pyreft uses flash attention and pyvene (https://github.com/stanfordnlp/pyvene), but I don't see any specific kernels aside from FlashAttention. I tried this on my CUDA machine and it's neat; not sure how effective...
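For anyone trying to reproduce this, attaching a ReFT intervention looks roughly like the sketch below, which follows the pyreft README; exact argument names may differ between versions, and the base model is just a placeholder.

```python
# Rough sketch of attaching a LoReFT intervention with pyreft (per its README; versions may differ).
import torch
import transformers
import pyreft

model = transformers.AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-chat-hf", torch_dtype=torch.bfloat16, device_map="cuda"
)

reft_config = pyreft.ReftConfig(representations={
    "layer": 15,
    "component": "block_output",
    "low_rank_dimension": 4,
    "intervention": pyreft.LoreftIntervention(
        embed_dim=model.config.hidden_size, low_rank_dimension=4
    ),
})
reft_model = pyreft.get_reft_model(model, reft_config)
reft_model.print_trainable_parameters()  # only the intervention parameters train
```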

I have been using the trainer functionality for a while, but when trying it with [Hugging Face's new SmolLM 135M model](https://huggingface.co/HuggingFaceTB/SmolLM-135M), no matter what the dataset, I'd end up with EOS...
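The issue text is truncated here, but a quick way to sanity-check EOS handling for a new base model is to inspect the tokenizer's special tokens and confirm whether EOS actually gets appended during tokenization. A small diagnostic sketch (only the model name comes from the issue; everything else is illustrative):

```python
# Diagnostic sketch: inspect EOS/pad handling for SmolLM-135M.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM-135M")

print("eos:", tokenizer.eos_token, tokenizer.eos_token_id)
print("pad:", tokenizer.pad_token, tokenizer.pad_token_id)

# Many causal-LM tokenizers do NOT append EOS by default, so a trainer that
# relies on it has to add it explicitly during preprocessing.
ids = tokenizer("hello world")["input_ids"]
print("ends with EOS:", ids[-1] == tokenizer.eos_token_id)
```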

### Suggestion / Feature Request

I've been curious about this for a while now, and more so since reading Disentangling Dense Embeddings with Sparse Autoencoders (https://arxiv.org/html/2408.00657v2). It looks like most of the ingredients in pyvene...
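For reference, the core object in that paper is a plain sparse autoencoder trained over dense embedding activations. A minimal PyTorch sketch of the idea (dimensions and the L1 coefficient are illustrative; none of this comes from pyvene or the paper's code):

```python
# Minimal sparse autoencoder sketch (illustrative only).
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_hidden)
        self.decoder = nn.Linear(d_hidden, d_model)

    def forward(self, x):
        # ReLU keeps the latent code non-negative; the L1 penalty below makes it sparse.
        z = torch.relu(self.encoder(x))
        return self.decoder(z), z

sae = SparseAutoencoder(d_model=768, d_hidden=768 * 8)
x = torch.randn(32, 768)                   # a batch of dense embeddings
recon, z = sae(x)
l1_coeff = 1e-3                            # sparsity strength (illustrative)
loss = nn.functional.mse_loss(recon, x) + l1_coeff * z.abs().mean()
loss.backward()
```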

enhancement

Just FYI, think this is failing because of a LoRA with only certain blocks trained:

```
File "flux-fp8-api/flux_pipeline.py", line 163, in load_lora
    self.model = lora_loading.apply_lora_to_model(self.model, lora_path, scale)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File ".../miniconda3/envs/flux/lib/python3.11/site-packages/torch/utils/_contextlib.py",...
```
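If it helps with debugging, one way to see which blocks a LoRA actually covers is to list the keys in the safetensors file before applying it, so a loader could skip modules that have no adapter weights. A purely diagnostic sketch (`lora_path` is a placeholder, not from the traceback):

```python
# Diagnostic sketch: list which modules a LoRA checkpoint actually contains.
from safetensors import safe_open

lora_path = "my-partial-lora.safetensors"  # placeholder

with safe_open(lora_path, framework="pt", device="cpu") as f:
    keys = sorted(f.keys())

# Group by the module path before the lora_A / lora_B suffix to see which
# blocks actually have adapter weights.
modules = sorted({k.rsplit(".lora", 1)[0] for k in keys})
for m in modules:
    print(m)
print(f"{len(modules)} modules with LoRA weights")
```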