In the web demo, inference only supports plain language-model chat; there is no multimodal inference.
For example: I have fine-tuned Yi-VL-6B and want to chat with it through web inference, but the inference page has no dedicated image input option, only plain chat. I hope this can be improved.
Correction: without image input, yi-vl-6B cannot chat at all, so web inference only works for pure language models.
Yes, the web UI does not support inference for multimodal models yet. You can use `swift infer` to run inference from the command line.
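A minimal sketch of the command-line route, assuming a fine-tuned checkpoint (the `--ckpt_dir` path below is a placeholder; check `swift infer --help` for the flags supported by your version):

```shell
CUDA_VISIBLE_DEVICES=0 swift infer --ckpt_dir output/yi-vl-6b-chat/vx-xxx/checkpoint-xxx
```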
Great, thanks. Then would it be possible to use a loop to extract the key fields from a JSON file and run inference on every entry in it?
Something like this (this is batch-prediction code for Qwen-VL):

```python
import json
import time

from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

DEFAULT_CKPT_PATH = 'your/checkpoint/path'  # path to the fine-tuned checkpoint


def _load_model_tokenizer():
    tokenizer = AutoTokenizer.from_pretrained(
        DEFAULT_CKPT_PATH,
        trust_remote_code=True,
        resume_download=True,
    )
    model = AutoPeftModelForCausalLM.from_pretrained(
        DEFAULT_CKPT_PATH,  # path to the output directory
        device_map="cuda",
        trust_remote_code=True,
    ).eval()
    # model.generation_config = GenerationConfig.from_pretrained(
    #     DEFAULT_CKPT_PATH, trust_remote_code=True, resume_download=True,
    # )
    return model, tokenizer


def _parse_text(text):
    # Escape model output for HTML display (from the Qwen-VL web demo).
    lines = text.split("\n")
    lines = [line for line in lines if line != ""]
    count = 0
    for i, line in enumerate(lines):
        if "```" in line:
            count += 1
            items = line.split("`")
            if count % 2 == 1:
                lines[i] = f'<pre><code class="language-{items[-1]}">'
            else:
                lines[i] = "<br></code></pre>"
        elif i > 0:
            if count % 2 == 1:
                line = line.replace("`", r"\`")
            line = line.replace("<", "&lt;")
            line = line.replace(">", "&gt;")
            line = line.replace(" ", "&nbsp;")
            line = line.replace("*", "&ast;")
            line = line.replace("_", "&lowbar;")
            line = line.replace("-", "&#45;")
            line = line.replace(".", "&#46;")
            line = line.replace("!", "&#33;")
            line = line.replace("(", "&#40;")
            line = line.replace(")", "&#41;")
            line = line.replace("$", "&#36;")
            lines[i] = "<br>" + line
    return "".join(lines)


def predict(message):
    start = time.time()
    message = _parse_text(message)
    print("User: " + message)
    history = []
    response, history = model.chat(tokenizer, message, history=history)
    full_response = _parse_text(response)
    print("Qwen-VL-Chat: " + full_response)
    print(f"Elapsed: {time.time() - start:.2f}s")
    return full_response


model, tokenizer = _load_model_tokenizer()

if __name__ == '__main__':
    with open(r'your/path', 'r', encoding='utf-8') as f:
        valid_data = json.load(f)
    data_list = []
    start = time.time()
    for index, item in enumerate(valid_data):
        image_id = item.get("id")
        print(f'Progress: {index + 1}/{len(valid_data)}; {image_id}')
        conversations = item.get("conversations")
        query = conversations[0].get("value")
        true_ = conversations[1].get("value")
        full_response = predict(query)
        print("Ground truth:", true_)
        data_list.append({
            "image_id": image_id,
            "ground_truth": true_,
            "prediction": full_response,
            "query": query,
        })
        print(f"Elapsed: {time.time() - start:.2f}s")

    output_path = 'your/predict.json'
    with open(output_path, 'w', encoding='utf-8') as f:
        json.dump(data_list, f, ensure_ascii=False, indent=2)
    print(f"Output saved to {output_path}")
```
In the earlier 1.5.4 release, qwen-vl-chat could be loaded and images could be passed for inference with `<img>xxx.jpg</img>`; as of 1.7.0 this is no longer supported. Could this be restored?