l0d0v1c
`echo "window.location.replace('/voila/render/index.ipynb');" > /usr/local/share/jupyter/voila/templates/base/tree.html` Then run voila in the folder index.ipynb is
Merging is implemented here https://github.com/mzbac/mlx-lora but I haven't found yet how to convert to GGUF.
Thank you @awni. MLX fine-tuning is very good on Mistral. A pity we can't get a GGUF compatible with llama.cpp, or maybe reverse the quantisation to HF format?
Succeeded by using fuse.py: `python fuse.py --model mlx_model --save-path ./fuse --adapter-file adapter.npz`, then rename weights.00.safetensors to model.safetensors. The convert.py from llama.cpp works fine afterwards:
```
python convert.py ./fuse
./quantize ./fuse/ggml-model-f16.gguf...
```
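For anyone reproducing this, the full sequence is roughly the following; the q4_0 quantization type and the output file names are just examples, not exactly what I ran:

```sh
# Rough sketch of the fuse -> convert -> quantize pipeline.
python fuse.py --model mlx_model --save-path ./fuse --adapter-file adapter.npz
mv ./fuse/weights.00.safetensors ./fuse/model.safetensors
python convert.py ./fuse   # llama.cpp's script; writes ./fuse/ggml-model-f16.gguf
./quantize ./fuse/ggml-model-f16.gguf ./fuse/ggml-model-q4_0.gguf q4_0
```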
Yes exactly
@USMCM1A1 you have to clone the llama.cpp repo, then `make` is enough on Mac. Rename weights.00 to model, then run `python convert.py thedirectoryofyourmodel`. It will produce a file "ggml-model-f16.gguf" in the same directory...
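If the build part isn't clear, it is only:

```sh
# Clone and build llama.cpp on macOS; plain make was enough for me.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make
```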
@USMCM1A1 my project is also linguistic (ancient Greek). I'm not a computer scientist either, but I play with buttons.
@USMCM1A1 I work on an AI able to deal with the philosophy of Diogenes and Antisthenes. The results are just incredible. Happy you succeeded. I sent you a LinkedIn invitation to share...
Thanks Awni. I simply use `python lora.py --model ./mixtral4b --adapter-file adapters-mixtral.npz --train --iters 1000 --lora-layers 16` with the same dataset I used to fine-tune Mistral 7B. On an M2 Max,...
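If it helps anyone reproducing this: the dataset is just the JSONL layout the mlx-examples LoRA script expects, as far as I know a `data/` folder with one `{"text": ...}` object per line. A minimal placeholder sketch:

```sh
# lora.py reads ./data/{train,valid}.jsonl by default (to my knowledge).
# The line below is a made-up placeholder, not my actual dataset.
mkdir -p data
cat > data/train.jsonl <<'EOF'
{"text": "Q: Who founded the Cynic school? A: Antisthenes, a pupil of Socrates."}
EOF
cp data/train.jsonl data/valid.jsonl
python lora.py --model ./mixtral4b --adapter-file adapters-mixtral.npz --train --iters 1000 --lora-layers 16
```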
> Which Mixtral model are you using? Is it quantized / fp16 / bf16? I will try running on the WikiSQL example dataset, but it may not reproduce there which...