
Problem loading UltraChat 65B for fine-tuning

Open jinmin527 opened this issue 2 years ago • 0 comments

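    import bmtrain as bmt
    from transformers import LlamaForCausalLM, LlamaTokenizer

    def get_model_tokenizer(args):
        # Every process that calls this loads a full copy of the checkpoint.
        model = LlamaForCausalLM.from_pretrained(args.model_name_or_path)
        tokenizer = LlamaTokenizer.from_pretrained(args.model_name_or_path)
        tokenizer.add_special_tokens({'pad_token': ""})
        # Grow the embedding matrix to cover the added special token.
        model.resize_token_embeddings(len(tokenizer))
        model = bmt.BMTrainModelWrapper(model)
        return model, tokenizer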

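Suppose the UltraChat 65B model is loaded for fine-tuning on a single server with 8 GPUs. Would this run out of memory (OOM)? The process for each GPU executes `model = LlamaForCausalLM.from_pretrained(args.model_name_or_path)` and loads its own copy of the model. Even if the weights stay in CPU memory, a 65B model needs roughly 130 GB, so 8 processes need about 1 TB in total, and the server's total memory is also about 1 TB.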
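The estimate above can be checked with a few lines of arithmetic. This is only a rough sketch: it assumes 2 bytes per parameter (16-bit weights), which is what the 130 GB figure implies, and it ignores optimizer state, activations, and any temporary buffers created during loading. The function name `host_memory_gb` is hypothetical and exists only for this illustration.

```python
def host_memory_gb(n_params: int, bytes_per_param: int, n_processes: int) -> float:
    """Host RAM in GB (1e9 bytes) if every process holds a full model copy."""
    return n_params * bytes_per_param * n_processes / 1e9


PARAMS_65B = 65_000_000_000  # nominal parameter count of a 65B model

# Assumption: 16-bit weights, i.e. 2 bytes per parameter.
per_copy_gb = host_memory_gb(PARAMS_65B, bytes_per_param=2, n_processes=1)
all_ranks_gb = host_memory_gb(PARAMS_65B, bytes_per_param=2, n_processes=8)

# For comparison: the same replicated load with 32-bit weights.
all_ranks_fp32_gb = host_memory_gb(PARAMS_65B, bytes_per_param=4, n_processes=8)

print(f"one copy, 16-bit:     {per_copy_gb:.0f} GB")
print(f"eight copies, 16-bit: {all_ranks_gb:.0f} GB")
print(f"eight copies, 32-bit: {all_ranks_fp32_gb:.0f} GB")
```

Under these assumptions, eight replicated 16-bit copies come to about 1040 GB, which already matches or exceeds a server with roughly 1 TB of RAM before anything else is allocated.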

jinmin527, Aug 19 '23 11:08