DP-FTRL
FTRL should be identical to SGD for the unconstrained problem when no noise is added
On line https://github.com/google-research/DP-FTRL/blob/main/optimizers.py#L59,
why is the update `ms + (-gs - nz) / alpha`? This makes FTRL not identical to SGD when the learning rate is not 1.0. Shouldn't it be
`ms + (-gs - nz) * alpha`?
Hi! Sorry for my really late reply. I'm not sure I fully understand your question, but I think it's likely due to a mismatch between the parameters: the `alpha` here is the *inverse* of the learning rate, so dividing by `alpha` is the same as multiplying by the learning rate. Does that make sense?
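To illustrate the point about `alpha` being the inverse of the learning rate, here is a small standalone sketch (not the repository's code; the variable names are mine). With no noise and an unconstrained problem, the FTRL iterate `w_t = -(g_1 + ... + g_t) / alpha` with `alpha = 1 / lr` matches the SGD iterates started from zero:

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 3
grads = [rng.standard_normal(dim) for _ in range(5)]  # stand-in gradient sequence

lr = 0.1
alpha = 1.0 / lr  # alpha is the inverse of the learning rate

# FTRL closed-form iterate: divide the negated gradient sum by alpha,
# mirroring the `/ alpha` in the line quoted above (noise set to zero).
grad_sum = np.zeros(dim)
ftrl_iterates = []
for g in grads:
    grad_sum += g
    ftrl_iterates.append(-grad_sum / alpha)

# Plain SGD from w_0 = 0 with step size lr.
w = np.zeros(dim)
sgd_iterates = []
for g in grads:
    w = w - lr * g
    sgd_iterates.append(w.copy())

# The two trajectories coincide at every step.
for wf, ws in zip(ftrl_iterates, sgd_iterates):
    assert np.allclose(wf, ws)
```

So dividing by `alpha` (rather than multiplying) is exactly what recovers the SGD trajectory for any learning rate.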