Zhanpeng Zhou
Results: 2 issues of Zhanpeng Zhou
Fix the inefficiency of combine_batches in main_fedavg.py. After a simple test, the program runs correctly.
https://github.com/lucidrains/lion-pytorch/blob/6a74fdc0ba572ab5683dc0270c66c20ecbc02d09/lion_pytorch/lion_pytorch.py#L79 Decoupled decay refers to isolating the weight decay from the "gradient". The usual way to apply weight decay is to add an L2 regularization term to the loss function. For...
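To make the distinction concrete, here is a minimal sketch for a single scalar weight. It is not taken from the linked file: the function names `step_l2_coupled` and `step_decoupled` are hypothetical, and the update rule is a simplified sign-based step with no momentum, chosen only because the two variants visibly differ under it.

```python
def sign(x):
    # Returns -1, 0, or 1 depending on the sign of x.
    return (x > 0) - (x < 0)


def step_l2_coupled(p, grad, lr, wd):
    # L2 regularization: the penalty's gradient (wd * p) is added to the
    # loss gradient, so the decay passes through the update rule (sign here).
    g = grad + wd * p
    return p - lr * sign(g)


def step_decoupled(p, grad, lr, wd):
    # Decoupled decay: shrink the weight directly, then apply the update
    # computed from the raw gradient only.
    p = p * (1 - lr * wd)
    return p - lr * sign(grad)


coupled = step_l2_coupled(1.0, 0.5, lr=0.1, wd=0.1)
decoupled = step_decoupled(1.0, 0.5, lr=0.1, wd=0.1)
print(coupled, decoupled)
```

In the coupled variant the decay term only influences the sign of the step, so its magnitude is lost; in the decoupled variant the weight shrinks by a factor of `1 - lr * wd` regardless of what the update rule does to the gradient.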