zjding
> > I have upgraded to dify 0.5.3, but max tokens can still only be set to 2048.
> > > BTW, max_tokens is different from context_size; max_tokens specifies...
> @lileiseven Something like this. Thanks.

Maybe that works on Ollama, but Xinference doesn't have such options; I can only set less than 2048 in dify's interface, is that...
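The max_tokens vs. context_size distinction discussed above can be sketched as a toy check (this is illustrative only, not any provider's real API):

```python
# max_tokens caps only the generated completion; context_size is the model's
# full window, which the prompt and the completion must share.
def fits_in_window(prompt_tokens: int, max_tokens: int, context_size: int) -> bool:
    """Return True if the prompt plus the requested completion fit the window."""
    return prompt_tokens + max_tokens <= context_size

# A 4096-token window with a 3000-token prompt leaves room for at most
# 1096 completion tokens:
print(fits_in_window(3000, 1096, 4096))   # True
print(fits_in_window(3000, 2048, 4096))   # False
```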
Oh, I found it. The URLs in get_data.sh have been changed to:
wget https://msmarco.z22.web.core.windows.net/msmarcoranking/qidpidtriples.train.full.2.tsv.gz
wget https://msmarco.z22.web.core.windows.net/msmarcoranking/qrels.train.tsv -O qrels.train.tsv
> This is probably related to flash-attn

Very likely, but the Turing architecture doesn't support flash-attn. Does DeepSpeed have to use flash-attn? DeepSpeed's configuration is a bit complex and I haven't fully figured it out.
> flash-attn is not required

Thanks for the reply. Could you briefly explain how to disable flash-attn under this framework? I tried disabling it but still got errors; maybe my approach was wrong.
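One common way to disable flash-attn (assuming the framework loads models through Hugging Face transformers, which the thread doesn't confirm) is to select the standard attention backend via `attn_implementation="eager"`, which also works on Turing GPUs. A minimal sketch:

```python
# Hedged sketch: build the from_pretrained kwargs that pick the attention
# backend. "eager" is the standard PyTorch attention path and avoids
# flash-attn entirely; "flash_attention_2" requires Ampere or newer GPUs.
def attention_kwargs(use_flash_attn: bool = False) -> dict:
    """Return kwargs controlling the attention implementation."""
    impl = "flash_attention_2" if use_flash_attn else "eager"
    return {"attn_implementation": impl}

# Usage (model id is a placeholder):
# from transformers import AutoModelForCausalLM
# model = AutoModelForCausalLM.from_pretrained("some/model", **attention_kwargs(False))
```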
Support different model providers such as Gemini, DeepSeek, etc., and change the tokenizer to match the selected model.
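One way the requested provider-to-tokenizer matching could look is a simple lookup table; this is a hypothetical sketch only, and neither the mapping nor the placeholder names come from any real codebase:

```python
# Hypothetical provider -> tokenizer mapping; real identifiers would have to
# be filled in per provider. "cl100k_base" is a real tiktoken encoding; the
# other values are placeholders.
PROVIDER_TOKENIZER = {
    "openai": "cl100k_base",
    "gemini": "gemini-tokenizer-placeholder",
    "deepseek": "deepseek-tokenizer-placeholder",
}

def tokenizer_for(provider: str, default: str = "cl100k_base") -> str:
    """Pick a tokenizer identifier matching the configured provider."""
    return PROVIDER_TOKENIZER.get(provider, default)
```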