PAMD.LA
> I had a similar issue. Turns out I had to **open all the ports for my server in the network firewall** as Client tries to connect to assigned port...
Slow inference has many possible causes; one is simply model size (too many parameters). For comparison, the CPU build of gpt4all runs at an acceptable speed. So I am wondering whether MOSS could reach that level too. For companies that only have large CPU clusters, that would be valuable.
> You just need to modify model_inference.py and remove the `.cuda()` and `.to("cuda")` calls.

OK, I'll go look for them.
```bash
# wandb local --upgrade
wandb: WARNING `wandb local` has been replaced with `wandb server start`.
```

My OS: MacBook Pro, macOS 12.6.3; wandb: 0.17.1. Not working here...
Great, I made it work. However, reranker models are really helpful for RAG; please make a plan to support them.
Has this been solved? I hit the same problem when trying to register the GGUF of qwq32b.
I installed the Xinference image via `docker pull xprobe/xinference:v1.4.1`. Does this image not include CUDA?
Yeah, but I did not find a way out... it seems kind of complicated to modify the frontend code...
No, I am not good at frontend frameworks like this.