Ivan Fioravanti
Thanks @madroidmaq, this PR is great! Can't wait to see it merged! 🥳
Please create a PR for this, it's great! @tsubasaxZZZ and @xaocon
Ollama is surely fast for inference, being backed by llama.cpp, but direct Apple MLX support can speed up fine-tuning and use in DSPy, without having to export to GGUF...
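For example, with mlx-lm you can load and run a model straight from the weights, skipping the GGUF conversion step entirely. A minimal sketch, assuming `pip install mlx-lm`; the model name is just illustrative:

```python
# Minimal sketch: run a model directly with MLX via mlx-lm,
# no GGUF export needed. Assumes `pip install mlx-lm`;
# the model repo below is illustrative.
from mlx_lm import load, generate

# Load weights and tokenizer straight from a Hugging Face repo.
model, tokenizer = load("mlx-community/Mistral-7B-Instruct-v0.2-4bit")

# Generate on Apple silicon without any conversion step.
text = generate(
    model,
    tokenizer,
    prompt="Why is MLX fast on Apple silicon?",
    max_tokens=100,
)
print(text)
```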
Wow this is amazing @meganoob1337
Any news on this one? Parallel requests could be a real game changer for Ollama.
Same here: multiple 7B models served by an M2 Ultra. My dream! 🙏
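Just to sketch the idea, here's what firing parallel requests at the Ollama HTTP API (`/api/generate`) could look like once this lands. Hypothetical: the model names and host are illustrative, and it assumes the server actually processes the requests concurrently:

```python
# Hypothetical sketch: concurrent requests against the Ollama HTTP API,
# one per model. Assumes parallel serving is supported; model names
# and host are illustrative.
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

OLLAMA_URL = "http://localhost:11434/api/generate"

def ask(model: str, prompt: str) -> str:
    # Ollama's /api/generate takes model, prompt, and stream flags as JSON.
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]

models = ["mistral:7b", "llama2:7b", "codellama:7b"]

# Fire one request per model in parallel threads.
with ThreadPoolExecutor(max_workers=len(models)) as pool:
    results = pool.map(lambda m: ask(m, "Summarize MLX in one sentence."), models)

for model, answer in zip(models, results):
    print(f"{model}: {answer}")
```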
I'll take a look at this one soon. Thanks!
Can you please recheck after the recent update? I don't see this issue.
Yes, all controllers must be configured for this to work.
Could you please try with the latest version (1.17.3), upgrading az to its latest version before starting? Thanks