Results: 2 comments by mylinfh

mylinfh:
Okay, thank you. I'll try again. llama.cpp/ollama can be used, but the inference time seems longer.
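For reference, a minimal sketch of querying a model served by ollama through its local REST API; the `llama3` model tag, the default port 11434, and the prompt are assumptions here, not values from the thread:

```python
# Minimal check against ollama's local REST API (assumes `ollama serve`
# is running and the model was pulled with `ollama pull llama3`).
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",  # ollama's default endpoint/port
    json={
        "model": "llama3",               # model tag as pulled via ollama
        "prompt": "Why is the sky blue?",
        "stream": False,                 # one JSON object instead of a token stream
    },
    timeout=300,
)
resp.raise_for_status()
print(resp.json()["response"])
```

Timing this call end-to-end is one simple way to compare ollama's latency against the other backends mentioned below.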
mylinfh:
Mmm, yes, thanks for your reply. I can run Llama 3 using NanoLLM. I also tried deploying inference on NanoLLM and MLC using Llama2-7B; both were fast, but...
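A minimal sketch of the NanoLLM loading/streaming pattern from its documentation, assuming the MLC backend; the model name and the `q4f16_ft` quantization preset are illustrative choices, not confirmed from the thread:

```python
# Load a model through NanoLLM's MLC backend and stream generated tokens.
from nano_llm import NanoLLM

model = NanoLLM.from_pretrained(
    "meta-llama/Llama-2-7b-chat-hf",  # HF repo name (assumed; any supported model works)
    api="mlc",                        # backends include mlc, awq, hf
    quantization="q4f16_ft",          # an MLC quantization preset
)

# generate() yields tokens as they are produced when iterated
for token in model.generate("Once upon a time,", max_new_tokens=128):
    print(token, end="", flush=True)
```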