guiniao comments

Results: 22 comments by guiniao

How to release GPU memory; torch.cuda.empty_cache() has no effect

@sujunze @ixjx @dizhenx, yes. I cleared the history and called torch.cuda.empty_cache() together, and that worked. torch.cuda.empty_cache() had no effect earlier because I was calling it in the wrong place.
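A minimal sketch of the "clear the history" half of this fix. It assumes the history is a list of (query, response) pairs that is passed back into the model on every turn; `trim_history` and `max_turns` are hypothetical names for illustration, not part of any library.

```python
def trim_history(history, max_turns=0):
    """Keep only the last `max_turns` (query, response) pairs.

    With the default max_turns=0 the history is cleared completely,
    which is what the comment above describes.
    """
    if max_turns <= 0:
        return []
    return list(history[-max_turns:])


if __name__ == "__main__":
    turns = [("q1", "a1"), ("q2", "a2"), ("q3", "a3")]
    print(trim_history(turns))      # history cleared
    print(trim_history(turns, 2))   # only the two most recent turns kept
```

A shorter history means a shorter prompt on the next turn, so less memory is needed per inference. Calling torch.cuda.empty_cache() afterwards returns the cached blocks to the driver.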

How to release GPU memory; torch.cuda.empty_cache() has no effect

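@controZheng, could it be that you are clearing the wrong card, or clearing in the wrong place? Check which card you are using and clear that one, and do the release after the predict function has run, just before the return.
    import torch

    def torch_gc():
        if torch.cuda.is_available():
            # 'cuda:1' must be the card the model actually runs on
            with torch.cuda.device('cuda:1'):
                torch.cuda.empty_cache()
                torch.cuda.ipc_collect()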
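The placement described here can be sketched as follows. This is only an illustration: `predict` and `model_chat` are hypothetical stand-ins for the real inference function and model call, and the device index 1 mirrors the `cuda:1` in the comment. The `torch` import is guarded so the sketch also runs on a machine without PyTorch or without a second GPU.

```python
try:
    import torch
except ImportError:  # allows the sketch to run without PyTorch installed
    torch = None


def torch_gc(device_index=1):
    """Release cached GPU memory on one card; return True if it was cleared."""
    if torch is None or not torch.cuda.is_available():
        return False
    if device_index >= torch.cuda.device_count():
        return False  # the requested card does not exist on this machine
    with torch.cuda.device(device_index):
        torch.cuda.empty_cache()
        torch.cuda.ipc_collect()
    return True


def predict(model_chat, query, history):
    """Hypothetical wrapper: run one inference, free the cache, then return."""
    response, history = model_chat(query, history)
    torch_gc()  # after inference, immediately before the return
    return response, history
```

Note that empty_cache() only returns memory that PyTorch has cached but is no longer using. Tensors that are still referenced, for example through a growing history, stay allocated.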

How to release GPU memory; torch.cuda.empty_cache() has no effect

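Could it be that a single inference uses too many tokens, so that one pass alone fills the GPU memory and there is never a chance to clear it?
--- Original message ---
From: ***@***.***>
Sent: Wednesday, June 7, 2023, 5:01 PM
To: ***@***.***>; Cc: ***@***.******@***.***>;
Subject: Re: [THUDM/ChatGLM-6B] How to release GPU memory; torch.cuda.empty_cache() has no effect (Issue #1144)
> @controZheng, could it be that you are clearing the wrong card, or clearing in the wrong place? Check which card you are using and clear that one, and do the release after the predict function has run, just before the return. def torch_gc(): if torch.cuda.is_available(): with torch.cuda.device('cuda:1'): torch.cuda.empty_cache() torch.cuda.ipc_collect()
It really doesn't work. I have checked the placement and the history many times...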

ImportError: cannot import name 'AsyncOpenAI' from 'openai'

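Solved. It was not directly related to the openai version. In my case, upgrading the server's gcc from 4.85 to 9.5 caused the first error; after rolling gcc back to 4.85 the problem went away.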

Return grouped query results

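@QingZ11, hello, please take a look at the image. I want the data in each circled group to be returned together. That is, across the whole book library, related data should come back group by group, so that I can tell which nodes are related to each other. Can this be done with limit and offset?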
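One possible client-side approach, sketched under assumptions: if the edges can be fetched as (source, destination) pairs, the related groups are the connected components of the graph, and they can be computed after the query returns. `group_connected` is a hypothetical helper, not a database feature, and this sketch does not settle whether the query language itself can return such groups.

```python
from collections import defaultdict, deque


def group_connected(edges):
    """Group node ids into connected components using breadth-first search."""
    adjacency = defaultdict(set)
    for src, dst in edges:
        adjacency[src].add(dst)
        adjacency[dst].add(src)

    seen = set()
    groups = []
    for start in list(adjacency):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:
            node = queue.popleft()
            component.append(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        groups.append(sorted(component))
    return groups


if __name__ == "__main__":
    sample_edges = [("a", "b"), ("b", "c"), ("x", "y")]
    print(group_connected(sample_edges))  # [['a', 'b', 'c'], ['x', 'y']]
```

LIMIT and OFFSET page through result rows in order, so on their own they do not guarantee that a page boundary coincides with a group boundary.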

Can't the GPU memory used by inference be released? As more questions are asked, GPU memory overflows

I'll give it a try. I have two GPUs and set os.environ['CUDA_VISIBLE_DEVICES']='0,1' in moss_gui_demo.py, but inference only uses one card. Does this setting not take effect?

Can't the GPU memory used by inference be released? As more questions are asked, GPU memory overflows

@zhiqix, I'll give it a try. I have two GPUs and set os.environ['CUDA_VISIBLE_DEVICES']='0,1' in moss_gui_demo.py, but inference only uses one card. Does this setting not take effect?
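A small check that may help here, offered as a sketch rather than a diagnosis of moss_gui_demo.py. CUDA_VISIBLE_DEVICES only controls which cards the process can see, and it has to be set before CUDA is initialized. It does not by itself split a model across the visible cards. `visible_gpu_count` is a hypothetical helper name, and the `torch` import is guarded so the snippet runs without PyTorch.

```python
import os

# Set this before importing torch (or anything else that initializes CUDA);
# changing it after CUDA has been initialized has no effect.
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"

try:
    import torch
except ImportError:  # allows the sketch to run without PyTorch installed
    torch = None


def visible_gpu_count():
    """Return how many GPUs this process can see (0 if CUDA is unavailable)."""
    if torch is None or not torch.cuda.is_available():
        return 0
    return torch.cuda.device_count()


if __name__ == "__main__":
    print("visible GPUs:", visible_gpu_count())
```

If this reports 2 but only one card is busy, the setting has taken effect and the model is simply being loaded onto a single device.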

Two GPUs configured but only one is used, causing an out-of-memory error

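No. The first card is already running inference. As inference goes on, memory usage on that one card keeps growing until it crashes, while the other card stays idle.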

Two GPUs configured but only one is used, causing an out-of-memory error

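@AllenWGX, the moss code was updated later and now handles this itself, so you no longer need to do it manually. At present only the non-quantized model supports dual-GPU inference; the 4-bit and 8-bit models do not. This is stated in the code.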

[Bug][ChatDB]Using qwen-1.5-14b-chat, there are more than a dozen tables in the library, but when asking for table information, it says that there are only four tables in the library. Is it because the number of model tokens is not enough or is there another problem

I am using Tongyi Qianwen-Turbo (通义千问-Turbo), and it also returns only 4 tables, even though the database contains many tables.