cdj0311
First load the model with `train_model`, then save it with `save_weights_as_checkpoint`; after that you can load it with `build_transformer_model`. Code as follows:

```python
# Re-save the model as a TF checkpoint
bert, train_model, loss = build_transformer_model_with_lm()
train_model.load_weights(model_saved_path)
ckpt_path = "./ckpt/model.ckpt"
bert.save_weights_as_checkpoint(ckpt_path)

# Load the model from the checkpoint
model = build_transformer_model(
    config_path=config_path,
    checkpoint_path=checkpoint_path
)
```
> Have you compared your results with huggingface's GPT?

Yes, I have; huggingface's output is fine.
> Could you post code that reproduces the error? I suspect it is a model-loading problem, because GPT2 has been used in WeChat's online service, so in principle it should work.

```python
import torch
import transformers
import turbo_transformers
import enum
import time
import numpy


class LoadType(enum.Enum):
    PYTORCH = "PYTORCH"
    PRETRAINED = "PRETRAINED"
    NPZ = "NPZ"


def test(loadtype: LoadType, use_cuda:...
```
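The script above is truncated, but its purpose is to compare turbo_transformers output against a reference implementation. As a minimal, self-contained sketch of the kind of numerical check involved (the helper and the placeholder logits below are illustrative, not taken from the thread):

```python
import numpy as np

def max_abs_diff(a: np.ndarray, b: np.ndarray) -> float:
    """Largest absolute element-wise difference between two arrays."""
    return float(np.max(np.abs(a - b)))

# Placeholder arrays standing in for huggingface vs. turbo_transformers logits
# computed on the same input.
hf_logits = np.array([[0.12, -1.30, 2.71], [0.05, 0.98, -0.44]])
tt_logits = hf_logits + 1e-6  # simulate tiny numerical drift

# Outputs from two backends are considered matching if they agree
# within a small tolerance rather than bit-for-bit.
print(max_abs_diff(hf_logits, tt_logits) < 1e-3)
```

In practice one would run both models on the same token IDs and apply such a tolerance check to the returned logits.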
Great work! Could you provide a script to convert Megatron Mixtral to HF?
The same question.