john tong

Results: 8 issues from john tong

![企业微信截图_16596898369394](https://user-images.githubusercontent.com/40350896/183041868-62a1255b-38c0-4f0d-9673-2619b6f5c41d.png)

**Feature Description** Provide an optional configuration so users can freely choose between the local faiss library and a remote/distributed milvus vector database as the vector store. **Problem Solved** When the knowledge base is large (txt files over 200 MB), faiss-cpu I/O is too slow, cannot use the GPU, and cannot scale horizontally; milvus will support GPU in version 2.3, uses a storage/compute-separated architecture, is cloud-native friendly, and is easy to deploy.
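
A minimal sketch of what such a switch could look like, assuming a LangChain-style stack; the function name, `kind` parameter, and connection defaults are illustrative assumptions, not the project's actual API:

```python
# Hypothetical factory for the proposed option; assumes a LangChain-style
# stack. All names and defaults here are illustrative, not the project's API.
def get_vector_store(kind: str, texts, embeddings):
    if kind == "faiss":
        from langchain.vectorstores import FAISS
        # Local in-process index: quick to set up, but CPU-bound I/O
        # and no horizontal scaling.
        return FAISS.from_texts(texts, embeddings)
    if kind == "milvus":
        from langchain.vectorstores import Milvus
        # Remote/distributed store; host/port are placeholder defaults.
        return Milvus.from_texts(
            texts,
            embeddings,
            connection_args={"host": "127.0.0.1", "port": "19530"},
        )
    raise ValueError(f"unsupported vector store: {kind}")
```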

enhancement

Please provide the complete information below so we can locate the problem quickly:

- System Environment: debian 11
- Version: Paddle: 2.5.1, PaddleOCR: 2.7.0.1
- Related components: cuda
- Command Code:

```bash
python huifu.py
```

- Complete code:

```python
import ...
```
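
Since the report points at cuda, one possible first triage step is Paddle's built-in install check, `paddle.utils.run_check()`, which verifies that the installed Paddle build can actually run a kernel on the GPU; this is only a suggested diagnostic, not a fix:

```python
import paddle

# Built-in sanity check: confirms Paddle was built with CUDA support
# and can run a small kernel on the detected GPU(s).
paddle.utils.run_check()

# Reports the device Paddle will use, e.g. "gpu:0" when CUDA is usable.
print(paddle.device.get_device())
```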

triaged
needs investigation
software compatibility

When converting the SUS-Chat-34B model (which is fully compatible with the llama architecture) to the flm format, this error was reported:

```text
root@5ce5bafeea81:/app# python glm_trans_flm.py
Loading checkpoint shards: 100%|██████████████████████████████████████████████████| 7/7 [01:09
```
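
For reference, a sketch of the usual fastllm conversion path for a llama-architecture model, using the `llm.from_hf` and `save` calls that fastllm documents; the model path, dtype, and output filename below are assumptions for illustration:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from fastllm_pytools import llm

# Model path, dtype, and output filename are illustrative assumptions.
path = "./SUS-Chat-34B"
tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(path, trust_remote_code=True).eval()

flm_model = llm.from_hf(model, tokenizer, dtype="float16")  # convert to fastllm
flm_model.save("sus-chat-34b.flm")                          # write the .flm file
```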

The conversion script is as follows; both the model and the lora can be found on huggingface:

```python
from transformers import AutoTokenizer, AutoModel
from peft import PeftModel

tokenizer = AutoTokenizer.from_pretrained("./chatglm3-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("./chatglm3-6b", trust_remote_code=True, device_map='auto')
model = model.half().cuda().eval()

# Add the two lines below to convert the huggingface model into a fastllm model.
# Currently the from_hf interface only accepts the original model, or ChatGLM's
# int4/int8 quantized models; other quantized models cannot be converted yet.
from fastllm_pytools ...
```

The conversion script is as follows; both the model and the lora can be found on huggingface:

```python
from transformers import AutoTokenizer, AutoModel
from peft import PeftModel

tokenizer = AutoTokenizer.from_pretrained("./chatglm-fitness-RLHF", trust_remote_code=True)
model = AutoModel.from_pretrained("./chatglm-fitness-RLHF", trust_remote_code=True, device_map='auto')
model = PeftModel.from_pretrained(model, "./Bofan-chatglm-Best-lora")
model = model.half().cuda().eval()

# Add the two lines below to convert the huggingface model into a fastllm model.
# ...
```
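
Since the comment above notes that `from_hf` only accepts an original (non-quantized) model, one possible workaround for the LoRA case is to fold the adapter into the base weights first with peft's `merge_and_unload()` and convert the merged model; the dtype and output path below are illustrative assumptions:

```python
from fastllm_pytools import llm

# Fold the LoRA adapter into the base weights (standard peft API), so
# from_hf sees a plain model rather than a PeftModel wrapper.
merged = model.merge_and_unload()

flm_model = llm.from_hf(merged, tokenizer, dtype="float16")
flm_model.save("chatglm-fitness-rlhf-merged.flm")  # illustrative output path
```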