john tong

Results: 8 issues from john tong

![企业微信截图_16596898369394](https://user-images.githubusercontent.com/40350896/183041868-62a1255b-38c0-4f0d-9673-2619b6f5c41d.png)

**Feature Description** Provide an optional configuration so users can freely choose between the local faiss library and a remote/distributed milvus vector database as the vector store. **Problem Solved** When the knowledge base is large (txt files over 200 MB), faiss-cpu I/O is too slow, cannot use the GPU, and cannot scale horizontally; milvus will support GPU in version 2.3, uses a storage/compute-separated architecture, is cloud-native friendly, and is easy to deploy.
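
A minimal sketch of what such a switch could look like, assuming a LangChain-style stack; the function name, `kind` parameter, and connection defaults are illustrative assumptions, not the project's actual API:

```python
# Hypothetical factory for the proposed option; assumes a LangChain-style
# stack. All names and defaults here are illustrative, not the project's API.
def get_vector_store(kind: str, texts, embeddings):
    if kind == "faiss":
        from langchain.vectorstores import FAISS
        # Local in-process index: quick to set up, but CPU-bound I/O
        # and no horizontal scaling.
        return FAISS.from_texts(texts, embeddings)
    if kind == "milvus":
        from langchain.vectorstores import Milvus
        # Remote/distributed store; host/port are placeholder defaults.
        return Milvus.from_texts(
            texts,
            embeddings,
            connection_args={"host": "127.0.0.1", "port": "19530"},
        )
    raise ValueError(f"unsupported vector store: {kind}")
```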

enhancement

Please provide the complete information below so we can locate the problem quickly:

- System Environment: debian 11
- Version: Paddle: 2.5.1, PaddleOCR: 2.7.0.1
- Related components: cuda
- Command Code:

```bash
python huifu.py
```

- Complete code:

```python
import ...
```
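
Since the report points at cuda, one possible first triage step is Paddle's built-in install check, `paddle.utils.run_check()`, which verifies that the installed Paddle build can actually run a kernel on the GPU; this is only a suggested diagnostic, not a fix:

```python
import paddle

# Built-in sanity check: confirms Paddle was built with CUDA support
# and can run a small kernel on the detected GPU(s).
paddle.utils.run_check()

# Reports the device Paddle will use, e.g. "gpu:0" when CUDA is usable.
print(paddle.device.get_device())
```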

triaged
needs investigation
software compatibility

When converting the SUS-Chat-34B model (which is fully compatible with the llama architecture) to the flm format, this error was reported:

```text
root@5ce5bafeea81:/app# python glm_trans_flm.py
Loading checkpoint shards: 100%|██████████████████████████████████████████████████| 7/7 [01:09
```
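
For reference, a sketch of the usual fastllm conversion path for a llama-architecture model, using the `llm.from_hf` and `save` calls that fastllm documents; the model path, dtype, and output filename below are assumptions for illustration:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from fastllm_pytools import llm

# Model path, dtype, and output filename are illustrative assumptions.
path = "./SUS-Chat-34B"
tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(path, trust_remote_code=True).eval()

flm_model = llm.from_hf(model, tokenizer, dtype="float16")  # convert to fastllm
flm_model.save("sus-chat-34b.flm")                          # write the .flm file
```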

The conversion script is as follows; both the model and the lora can be found on huggingface:

```python
from transformers import AutoTokenizer, AutoModel
from peft import PeftModel

tokenizer = AutoTokenizer.from_pretrained("./chatglm3-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("./chatglm3-6b", trust_remote_code=True, device_map='auto')
model = model.half().cuda().eval()

# Add the two lines below to convert the huggingface model into a fastllm model.
# Currently the from_hf interface only accepts the original model, or ChatGLM's
# int4/int8 quantized models; other quantized models cannot be converted yet.
from fastllm_pytools ...
```

The conversion script is as follows; both the model and the lora can be found on huggingface:

```python
from transformers import AutoTokenizer, AutoModel
from peft import PeftModel

tokenizer = AutoTokenizer.from_pretrained("./chatglm-fitness-RLHF", trust_remote_code=True)
model = AutoModel.from_pretrained("./chatglm-fitness-RLHF", trust_remote_code=True, device_map='auto')
model = PeftModel.from_pretrained(model, "./Bofan-chatglm-Best-lora")
model = model.half().cuda().eval()

# Add the two lines below to convert the huggingface model into a fastllm model.
# ...
```
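
Since the comment above notes that `from_hf` only accepts an original (non-quantized) model, one possible workaround for the LoRA case is to fold the adapter into the base weights first with peft's `merge_and_unload()` and convert the merged model; the dtype and output path below are illustrative assumptions:

```python
from fastllm_pytools import llm

# Fold the LoRA adapter into the base weights (standard peft API), so
# from_hf sees a plain model rather than a PeftModel wrapper.
merged = model.merge_and_unload()

flm_model = llm.from_hf(merged, tokenizer, dtype="float16")
flm_model.save("chatglm-fitness-rlhf-merged.flm")  # illustrative output path
```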