Fails to resume when training with float16
With the option:
--mixed_precision="fp16" \
training is faster, but when I try to resume training with
--resume_unet="/epoch_41_step_0/lora_weights/lora_e41_s0.pt" \
I get the error:
TypeError: cannot assign 'torch.HalfTensor' as parameter 'weight'
Is there any way to solve this problem?
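For reference, the error can be reproduced outside the training script. A minimal sketch (assuming the checkpoint tensors were saved as CPU fp16): assigning a plain half-precision tensor to a module attribute that is registered as a parameter raises exactly this TypeError, because nn.Module only accepts an nn.Parameter (or None) there.

import torch
import torch.nn as nn

layer = nn.Linear(4, 4)                        # stands in for lora_up / lora_down
half_weight = torch.randn(4, 4, dtype=torch.float16)

try:
    layer.weight = half_weight                 # plain tensor, not nn.Parameter -> TypeError
except TypeError as e:
    print(e)                                   # cannot assign 'torch.HalfTensor' as parameter 'weight'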
I tried to fix it with:

if loras is not None:
    print("########## inject from checkpoint ###########")
    _module._modules[name].lora_up.weight = torch.nn.Parameter(torch.tensor(loras.pop(0)).float().detach())
    _module._modules[name].lora_down.weight = torch.nn.Parameter(torch.tensor(loras.pop(0)).float().detach())

but it fails.
I think it's related to PyTorch version issues; modifying this section of the code should do it:

...
require_grad_params.append(_module._modules[name].lora_up.parameters())
require_grad_params.append(_module._modules[name].lora_down.parameters())

wt_tensor_type = _module._modules[name].lora_up.weight.dtype
if loras is not None:
    _module._modules[name].lora_up.weight.data = loras.pop(0).to(wt_tensor_type)
    _module._modules[name].lora_down.weight.data = loras.pop(0).to(wt_tensor_type)
...
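For completeness, a self-contained sketch of that workaround (the helper name load_lora_pair is hypothetical; in the repo the assignment happens inside the LoRA injection code): instead of replacing the Parameter object, copy each checkpoint tensor into the existing parameter's .data after casting it to that parameter's dtype, so the weights load whether the model is fp16 or fp32.

import torch
import torch.nn as nn

def load_lora_pair(lora_up: nn.Linear, lora_down: nn.Linear, loras: list):
    """Pop the next two tensors from `loras` and load them into lora_up / lora_down."""
    target_dtype = lora_up.weight.dtype            # fp16 when trained with --mixed_precision="fp16"
    lora_up.weight.data = loras.pop(0).to(target_dtype)
    lora_down.weight.data = loras.pop(0).to(target_dtype)

# usage sketch: fp32 tensors from a checkpoint get cast into fp16 modules
# without tripping the "cannot assign 'torch.HalfTensor' as parameter" check
up = nn.Linear(4, 8, bias=False).half()
down = nn.Linear(8, 4, bias=False).half()
checkpoint = [torch.randn(8, 4), torch.randn(4, 8)]   # placeholder for the saved lora weights
load_lora_pair(up, down, checkpoint)
print(up.weight.dtype, down.weight.dtype)              # torch.float16 torch.float16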