jiaxin11 comments

Results 5 comments of jiaxin11

Increasing memory usage in Core

Now that the exadel/compreface-core:1.2.0-arcface-r100-gpu version has been released, has the OOM problem been fixed?

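> How do I keep a model loaded in memory or unload it immediately?
>
> By default, a model is kept in memory for 5 minutes before being unloaded. This gives faster response times when you send frequent requests to the LLM. However, you may want to free the memory before the 5 minutes are up, or keep the model loaded indefinitely. Use the keep_alive parameter of the /api/generate and /api/chat API endpoints to control how long the model stays in memory.
>
> The keep_alive parameter can be set to: a duration string (e.g. "10m" or "24h"); a number in seconds (e.g. 3600); any negative number, which keeps the model in memory indefinitely (e.g. -1 or "-1m"); or '0', which unloads the model immediately after the response is generated.
>
> For example, to preload a model and keep it in memory, use:
>
> curl http://localhost:11434/api/generate -d '{"model": "llama3", "keep_alive": -1}'
>
> To unload the model and free the memory, use:
>
> curl http://localhost:11434/api/generate...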
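For anyone calling the endpoint from code rather than curl, here is a minimal Python sketch of the same keep_alive requests. The URL, model name, and keep_alive values come from the quoted text. The helper names `build_keep_alive_request` and `send` are hypothetical, and the assumption that a body with only `model` and `keep_alive` is enough to unload is mine, based on the description of '0' above.

```python
import json
import urllib.request

# Endpoint taken from the quoted text; adjust host/port for your setup.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_keep_alive_request(model, keep_alive, url=OLLAMA_URL):
    """Build (but do not send) a POST request that sets keep_alive for a model.

    Hypothetical helper: keep_alive may be a duration string ("10m"),
    a number of seconds (3600), a negative number (keep loaded), or 0 (unload).
    """
    payload = {"model": model, "keep_alive": keep_alive}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and return the raw response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Preload the model and keep it in memory indefinitely (same as the curl above).
preload = build_keep_alive_request("llama3", -1)

# Assumption: keep_alive 0 with no prompt asks the server to unload the model.
unload = build_keep_alive_request("llama3", 0)
```

Calling `send(preload)` or `send(unload)` only works with a server listening on that address; building the requests does not touch the network.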

how to use it

Why is it that after I call it, only the day is displayed? How do I use it?

Proguard Keep Code

thanks

jiaxin11

Prettified single-file version

Increasing memory usage in Core

Slow responses in knowledge-base Q&A: the model is released and reloaded on every question

how to use it

Proguard Keep Code