Shadovv_Singer
I've encountered a peculiar issue with the Gemma model that I'd like to share. I'm consistently getting a RuntimeError: probability tensor contains either inf, nan or an element < 0...
Hi @SedrickWang , I've tried the solution, but it doesn't seem to work. Post #10 also mentioned a 'RuntimeError: probability tensor contains either inf, nan, or an element < 0'...
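This error typically surfaces when the sampling step receives `nan`, `inf`, or negative values in the probability tensor, and a common root cause is float16 overflow earlier in the forward pass. A minimal NumPy sketch (illustrative values only, not Gemma's actual logits) of how fp16 overflow can turn softmax probabilities into NaN:

```python
import numpy as np

# Illustrative only: one oversized logit overflows in float16,
# which poisons the normalized probabilities with NaN.
logits = np.array([10000.0, 5.0], dtype=np.float16)

with np.errstate(over="ignore", invalid="ignore"):
    exp = np.exp(logits)     # exp(10000) overflows to inf in fp16
    probs = exp / exp.sum()  # inf / inf -> nan

print(np.isnan(probs).any())  # True: this is what trips the sampler
```

If this is the cause, loading the model in float32 or bfloat16 (which has a much larger exponent range) usually avoids the overflow.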
see https://github.com/ollama/ollama/issues/7206#issuecomment-2413928300 — you can try: start multiple ollama instances on different GPUs and different ports, then use nginx to distribute the requests. I am also very eager for this feature (one...
"1. multi ollama serve: failed on gpu split." I have successfully deployed using this method. I use screen to start separate terminals, setting CUDA_VISIBLE_DEVICES=0/1/2/3/4... (the GPU index) and OLLAMA_HOST= to a different port in each. I...
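The per-GPU setup described above can be sketched roughly as follows (GPU indices, ports, and the nginx listen port are illustrative assumptions, not values from the thread):

```shell
# Sketch only: launch one ollama instance per GPU, each on its own port.
CUDA_VISIBLE_DEVICES=0 OLLAMA_HOST=127.0.0.1:11434 ollama serve &
CUDA_VISIBLE_DEVICES=1 OLLAMA_HOST=127.0.0.1:11435 ollama serve &

# nginx can then round-robin requests across the instances, e.g.:
# upstream ollama_pool {
#     server 127.0.0.1:11434;
#     server 127.0.0.1:11435;
# }
# server {
#     listen 11400;
#     location / { proxy_pass http://ollama_pool; }
# }
```

Each instance sees only its assigned GPU, so the scheduler never attempts a cross-GPU split; clients talk to the single nginx port.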
Currently, "collapsed" seems to mean "not enabled", and expanding from the collapsed state seems to force it to start (regardless of whether default-start is set). What you mean is to decouple the "collapsed" concept from the "disabled" concept, so it always runs either way. That seems to conflict with what another group of users (like me) wants. I have configured both Google Translate and a custom (LLM) translator. When I'm just doing a simple word lookup, I don't want streamed content jumping around; Google Translate is sufficient, so I set the custom (LLM) translator to be off by default and enable it manually. There is also another group of users who want to save on LLM translation token costs, so they keep it off by default. In short, to satisfy both you and these user groups, a new option may be needed in the settings: "auto-start in background". This strengthens the feature, but it makes the interaction logic more complex. It's a trade-off at the software-design level between simplicity and control.