cortex.cpp
cortex.cpp copied to clipboard
bug: slow response after terminate the conversation
Describe the bug close / terminate the conversation of the streaming output will introduce slow performance of new chat
To Reproduce Steps to reproduce the behavior:
- start the check enable with stream
- when nitro streaming text output, close the chat from client
- in server log, it displayed "Task completed, release it - llamaCPP.cc:416"
- start the new chat, it will be slow response until restart nitro from server
Desktop (please complete the following information):
- OS: Mac
- Browser chrome