eigenLiu

Results: 8 issues by eigenLiu

**Describe the feature**
After fine-tuning, the model is unrecognizable: Hugging Face can no longer load it, and acceleration frameworks such as vLLM no longer work either.

**Paste any useful information**
Fine-tuning produced an adapter_model.safetensors file. I forcibly renamed it to model.safetensors and copied it over the original base model's folder. When I loaded the result, inference was completely wrong.

**Additional context**
Looking for a way to convert it back.
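Renaming adapter_model.safetensors cannot work, because a PEFT adapter file holds only the low-rank delta weights, not a full checkpoint. The usual fix is to merge the adapter into a clean copy of the base model. A minimal sketch, assuming the adapter was trained with the peft library; all paths below are placeholders:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load an untouched copy of the base model (not the overwritten folder).
base = AutoModelForCausalLM.from_pretrained("path/to/base-model")

# Attach the LoRA adapter, then fold its deltas into the base weights.
model = PeftModel.from_pretrained(base, "path/to/adapter-dir")
merged = model.merge_and_unload()

# Save a standalone checkpoint that transformers and vLLM can load directly.
merged.save_pretrained("path/to/merged-model")
AutoTokenizer.from_pretrained("path/to/base-model").save_pretrained("path/to/merged-model")
```

The merged folder should then load like any ordinary checkpoint, including in vLLM.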

question

Before this PR, the config field decoder_sparse_step decided which layers do not have experts, i.e. whether a layer gets a Qwen2MoeSparseMoeBlock or a Qwen2MoeMLP. However, that selection policy is not flexible...

Recently huggingface merged my PR: https://github.com/huggingface/transformers/pull/30552/files. I introduced a new config option "mlp_only_layers" to the Qwen MoE model. I think vLLM should keep the same model forward logic as the huggingface model definitions, so...
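For reference, the layer-selection rule after that PR can be summarized as below — a sketch of the logic rather than a verbatim copy of the transformers source; the parameter names mirror the Qwen2MoE config fields:

```python
def uses_sparse_moe(layer_idx: int,
                    decoder_sparse_step: int,
                    mlp_only_layers: list[int],
                    num_experts: int) -> bool:
    """Return True if this decoder layer should use Qwen2MoeSparseMoeBlock.

    mlp_only_layers overrides the periodic decoder_sparse_step rule: any
    layer listed there always falls back to the dense Qwen2MoeMLP.
    """
    if layer_idx in mlp_only_layers:
        return False
    # Periodic rule: every decoder_sparse_step-th layer is sparse.
    return num_experts > 0 and (layer_idx + 1) % decoder_sparse_step == 0
```

An inference engine that skipped the mlp_only_layers check would build expert blocks for layers the checkpoint trained as dense MLPs, which is why vLLM needs to mirror this logic.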

### Checklist
- [X] 1. I have searched related issues but cannot get the expected help.
- [X] 2. The bug has not been fixed in the latest version.
- ...

### Motivation
I only see support for AWQ; I have not seen any discussion of GPTQ.

### Related resources
_No response_

### Additional context
_No response_

### Is there an existing issue for this?
- [X] I have searched the existing issues

### What would you like to be added?
The concept needs to be clarified...

kind/enhancement

### Before Creating the Bug Report
- [x] I found a bug, not just asking a question, which should be created in [GitHub Discussions](https://github.com/apache/rocketmq/discussions).
- [x] I have searched the...