Kenwwww comments

Results: 8 comments by Kenwwww

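Whether the Qwen model supports returning multiple function_calls in a single response

Could you open-source the fine-tuning recipe for parallel function calls? I tried concatenating the instructions myself and building the instruction set with the functions given directly as a list, but the results were not good.

Fine-tuning DeepSeek-Coder-V2-Lite for code completion

> Is it enough to split the code into the `{code}{code}` form and feed it to the model? Unsupervised fine-tuning. Hello, how did you end up training the component library information into the model? I have a similar need.

[REQUEST]: I want to use Qwen2.5-coder-7B for continued pretraining; could you provide a llama-factory template?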

> ### Has this been supported or requested before? > * [x] I have checked [the GitHub README](https://github.com/QwenLM/Qwen2.5).[x] I have checked [the Qwen documentation](https://qwen.readthedocs.io).[x] I have checked the documentation of...

The calls to the large model insert are being rate-limited. How can I limit the concurrent requests to 5?

working_dir=WORKING_DIR, enable_llm_cache=True, best_model_func=llm_model_if_cache, cheap_model_func=llm_model_if_cache, best_model_max_async=5, cheap_model_max_async=5,
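If the `best_model_max_async` / `cheap_model_max_async` settings above are not enough, or the model function is called from your own code, one general way to cap concurrency is an `asyncio.Semaphore` wrapped around the call. The following is only a minimal sketch under that assumption: `limit_concurrency` and `fake_llm_call` are hypothetical names used for illustration, not part of any library mentioned in the thread.

```python
import asyncio

MAX_CONCURRENT = 5  # the limit asked about in the issue title


def limit_concurrency(func, max_concurrent=MAX_CONCURRENT):
    """Wrap an async function so at most `max_concurrent` calls run at once."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def wrapper(*args, **kwargs):
        # Callers beyond the limit wait here until a slot is released.
        async with semaphore:
            return await func(*args, **kwargs)

    return wrapper


# Counters used only to observe how many calls overlap in this demo.
in_flight = 0
peak = 0


async def fake_llm_call(prompt):
    """Hypothetical stand-in for a real (rate-limited) model call."""
    global in_flight, peak
    in_flight += 1
    peak = max(peak, in_flight)
    await asyncio.sleep(0.01)  # simulate network latency
    in_flight -= 1
    return f"echo: {prompt}"


async def main():
    # Create the wrapper inside the running event loop so the semaphore
    # is bound to the same loop on older Python versions as well.
    limited = limit_concurrency(fake_llm_call)
    return await asyncio.gather(*(limited(f"p{i}") for i in range(20)))


results = asyncio.run(main())
```

Here 20 requests are submitted at once, but no more than 5 are ever in flight. In a real setup the wrapped function would be the one passed as `best_model_func` / `cheap_model_func`, assuming it is an async callable.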

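xinference restarts for no reason and drops the model

Has there been any progress on this issue? Mine behaves the same way: every time, some long while after deployment, the model is automatically taken offline, the GPU memory is not released, and I have to restart the service manually.

Want to generate code based on the company's own components and code

No, I have stopped exploring this. The hallucinations are too severe and RAG is not accurate enough. If you make progress, remember to come back and let me know. ---Original message--- From: ***@***.***> Sent: Tuesday, August 12, 2025, 3:07 PM To: ***@***.***>; Cc: ***@***.******@***.***>; Subject: Re: [deepseek-ai/DeepSeek-Coder] Want to generate code based on the company's own components and code (Issue #649) wangbei-github left a comment (deepseek-ai/DeepSeek-Coder#649) Do I need incremental pretraining? Based on past experience with incremental pretraining, the hallucinations are always particularly severe. I now have a lot of component documentation, development standards documents, and historical project code; how can I obtain a code model that can generate code based on the company's own components? Same question here. Has there been any conclusion since? -- Reply to this email...

On the effectiveness of knowledge injection

> Same question: how should the effect of incremental pretraining be quantified, and is incremental pretraining necessary at all? I recently ran a CPT (continued pretraining) with full-parameter training, and my impression is that it does help as long as the corpus is something the model has not seen before. So if the goal is to inject private-domain knowledge, I think it is still worth running. If the goal is only to strengthen the model's understanding and use of knowledge in one particular area, I do not think CPT is needed; go straight to SFT cold start + RL. Personally, I feel that this legal scenario really does not need CPT; it comes across as doing it for the sake of doing it.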

[Feature]: Limit thinking tokens

> I have come up with logits processor approach: > > ``` > from typing import List, Optional > import torch > > class ThinkLogitsProcessor: > """A logits processor that...