Shadovv_Singer
I've encountered a peculiar issue with the Gemma model that I'd like to share. I'm consistently getting a RuntimeError: probability tensor contains either inf, nan or an element < 0...
Hi @SedrickWang , I've tried the solution, but it doesn't seem to work. Post #10 also mentioned a 'RuntimeError: probability tensor contains either inf, nan, or an element < 0'...
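This error typically surfaces when the sampling step receives `nan`, `inf`, or negative values in the probability tensor, and a common root cause is float16 overflow earlier in the forward pass. A minimal NumPy sketch (illustrative values only, not Gemma's actual logits) of how fp16 overflow can turn softmax probabilities into NaN:

```python
import numpy as np

# Illustrative only: one oversized logit overflows in float16,
# which poisons the normalized probabilities with NaN.
logits = np.array([10000.0, 5.0], dtype=np.float16)

with np.errstate(over="ignore", invalid="ignore"):
    exp = np.exp(logits)     # exp(10000) overflows to inf in fp16
    probs = exp / exp.sum()  # inf / inf -> nan

print(np.isnan(probs).any())  # True: this is what trips the sampler
```

If this is the cause, loading the model in float32 or bfloat16 (which has a much larger exponent range) usually avoids the overflow.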
see https://github.com/ollama/ollama/issues/7206#issuecomment-2413928300 — you can try: start multiple ollama instances on different GPUs and different ports, then use nginx to distribute the requests. I am also very eager for this feature (one...
"1. multi ollama serve: failed on gpu split." I have successfully deployed using this method. I use screen to start separate terminals, setting CUDA_VISIBLE_DEVICES=0/1/2/3/4... (the GPU index) and OLLAMA_HOST= to a different port in each. I...
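The per-GPU setup described above can be sketched roughly as follows (GPU indices, ports, and the nginx listen port are illustrative assumptions, not values from the thread):

```shell
# Sketch only: launch one ollama instance per GPU, each on its own port.
CUDA_VISIBLE_DEVICES=0 OLLAMA_HOST=127.0.0.1:11434 ollama serve &
CUDA_VISIBLE_DEVICES=1 OLLAMA_HOST=127.0.0.1:11435 ollama serve &

# nginx can then round-robin requests across the instances, e.g.:
# upstream ollama_pool {
#     server 127.0.0.1:11434;
#     server 127.0.0.1:11435;
# }
# server {
#     listen 11400;
#     location / { proxy_pass http://ollama_pool; }
# }
```

Each instance sees only its assigned GPU, so the scheduler never attempts a cross-GPU split; clients talk to the single nginx port.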
Currently, "collapsed" seems to mean "not enabled", and expanding from the collapsed state seems to force it to start (regardless of whether default-start is set). What you mean is to decouple the "collapsed" concept from the "disabled" concept, so it always runs either way. That seems to conflict with what another group of users (like me) wants. I have configured both Google Translate and a custom (LLM) translator. When I'm just doing a simple word lookup, I don't want streamed content jumping around; Google Translate is sufficient, so I set the custom (LLM) translator to be off by default and enable it manually. There is also another group of users who want to save on LLM translation token costs, so they keep it off by default. In short, to satisfy both you and these user groups, a new option may be needed in the settings: "auto-start in background". This strengthens the feature, but it makes the interaction logic more complex. It's a trade-off at the software-design level between simplicity and control.