DirtyKnightForVi

Results 24 comments of DirtyKnightForVi

> This change adds support for multiple concurrent requests, as well as loading multiple models by spawning multiple runners. This change is designed to be "opt in" initially, so the...

> > > This change adds support for multiple concurrent requests, as well as loading multiple models by spawning multiple runners. This change is designed to be "opt in" initially,...

> Does DeepSeek-VL series support input of multiple images? This doesn't seem to be stated in the paper, but `images` in the example script are `list`, which seems to be...
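The question above hinges on `images` being a list in the example scripts. A minimal sketch of that conversation structure (field names follow the DeepSeek-VL example scripts; whether the model actually attends to more than one image is exactly the open question, so treat the two-image payload as an assumption to test):

```python
# Sketch of a DeepSeek-VL style conversation payload. The structure mirrors
# the repo's example scripts; passing two paths is syntactically valid because
# `images` is a list, but multi-image support by the model itself is unconfirmed.
conversation = [
    {
        "role": "User",
        "content": "<image_placeholder><image_placeholder> Compare these two charts.",
        "images": ["chart_a.png", "chart_b.png"],  # hypothetical file names
    },
    {"role": "Assistant", "content": ""},
]

# Collect every image path referenced across the conversation turns.
all_images = [p for turn in conversation for p in turn.get("images", [])]
print(len(all_images))  # 2
```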

Simply uninstall GPTQ completely and then reinstall it to solve this problem.
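A sketch of the reinstall, assuming the package was installed via pip; the exact package name depends on which GPTQ build you used (`auto-gptq` here is an assumption):

```shell
# Package name is an assumption; adjust to whichever GPTQ package you installed.
pip uninstall -y auto-gptq
pip install auto-gptq --no-cache-dir
```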

> I didn't test on Windows 11, but it should work if you have a GPU. Can you double-check that your gptq installation completed successfully?

When I try...

> Facing the exact same issue as @DirtyKnightForVi

Follow this link: https://github.com/juncongmoo/pyllama/issues/35. You'd better uninstall transformers and reinstall it with `pip install git+https://github.com/mbehm/transformers`.

The 3070's board design should support 24 GB; inference will just be slower. You can find a trustworthy GPU repair technician to upgrade the VRAM for you. For reference, there is a repair technician on Bilibili who upgraded a 2080 from 11 GB to 22 GB.

+1. DeepSeek-V2-Chat is a different MoE architecture compared to DBRX, Mixtral, and GLM-4. In terms of API experience, it's on par with GPT-4.

> > Hi @0sengseng0 it seems you're missing an O: `OLLAMA_NUM_PARALLEL=3`
>
> Will close this as it is definitely plan A :)

According to the logs, n_ctx...
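For reference, `OLLAMA_NUM_PARALLEL` is set in the server's environment when launching it, not per request (the value below is illustrative):

```shell
# Enable three concurrent requests per loaded model (example value).
OLLAMA_NUM_PARALLEL=3 ollama serve
```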

In my case, my prompt is as follows:

# DDL
```sql
create table XXXXX (about 12000 tokens)
```

# HINT
(about 500 tokens, notes, etc.)

If OLLAMA_NUM_PARALLEL=2, the model cannot get table...
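The truncation described above is consistent with the context window being shared among parallel slots, as llama.cpp's server does with `n_ctx`; whether a given Ollama version scales `num_ctx` to compensate varies, so the halving below is an assumption, as is the 16384-token context length:

```python
# Illustration: if the per-request window is n_ctx divided by the number of
# parallel slots, OLLAMA_NUM_PARALLEL=2 halves what each prompt can use.
num_ctx = 16384              # assumed total context length, for illustration
prompt_tokens = 12000 + 500  # DDL (~12000) + HINT (~500) from the comment above

for parallel in (1, 2):
    per_slot = num_ctx // parallel
    fits = prompt_tokens <= per_slot
    print(f"OLLAMA_NUM_PARALLEL={parallel}: per-request ctx={per_slot}, prompt fits: {fits}")
```

With these numbers the prompt fits at `OLLAMA_NUM_PARALLEL=1` (16384 tokens per request) but not at 2 (8192 per request), which would explain the model losing the tail of the DDL.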