llm-export
Exported phi-1_5 model does not run inference correctly.
The project natively supports only phi-2, but in principle phi-1_5 should work as well. Below is a load_model function I wrote by following the phi-2 implementation. The model exports fine, but some parameters seem to be mismatched, and the exported model does not produce correct inference results. Could you please take a look? Many thanks! Model link: https://huggingface.co/microsoft/phi-1_5
Here is the load_model function I wrote:
def load_model(self):
    transformer = self.model.model
    self.lm = self.model.lm_head
    self.embed_ = transformer.embed_tokens
    self.hidden_size = self.embed_.weight.shape[-1]
    self.blocks_ = transformer.layers
    self.final_layernorm_ = transformer.final_layernorm
    # Some wrapper
    self.stop_ids.append(self.tokenizer.eos_token_id)
    self.block_nums = len(self.blocks_)
    self.embed = Embedding(self.embed_, self.embed_bf16)
    self.lm = Lm(self.lm)
    self.blocks = [PHI2Block(self.blocks_[i], i, self.hidden_size) for i in range(self.block_nums)]
    # Some config for export
    self.past_kv_shape = [len(self.blocks), 1, 0, 2, 32, 80]
    self.block_dynamic_axes = {
        "inputs_embeds" : { 0: "seq_len" },
        "attention_mask" : { 2: "seq_len", 3: "seq_len" },
        "position_ids" : { 0: "seq_len" },
        "past_key_values" : { 1: "history_len" }
    }
    self.model_dynamic_axes = {
        "input_ids" : { 0: "seq_len" },
        "attention_mask" : { 2: "seq_len", 3: "seq_len" },
        "position_ids" : { 0: "seq_len" },
        "past_key_values" : { 2: "history_len" }
    }