KFAC in early stages query

Open priyamiitkgp opened this issue 4 years ago • 1 comments

Hi, I ran the notebook from the docs (the Keras KFAC example for CIFAR-10) with the same network (ResNet-20) and the tuned hyperparameters, and compared the first few epochs against a standard SGD optimizer (lr = 0.1). The issue is that the KFAC optimizer was not significantly faster (14x) than SGD. In most loss-vs-epoch plots I have seen, the KFAC loss drops much faster than that of other optimizers such as SGD, but that wasn't the case here.

Would be great if you could help me understand where I might be going wrong. I've attached a training accuracy plot comparing KFAC and SGD.

Thanks!!

[Image: Screenshot (190)]

priyamiitkgp • Feb 05 '21 13:02

Hi. That "14x" figure applies only to a certain architecture, and isn't meant to be universal. However, I can see from the README that the phrasing suggests otherwise, and so I've removed it. So far, the most compelling applications of K-FAC that I'm aware of are to deep autoencoders and vanilla networks using DKS/TAT. See https://arxiv.org/abs/2110.01765
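For anyone comparing the two optimizers on their own runs, one way to put a number on a speedup claim is to count how many epochs each optimizer needs to reach the same target accuracy. The snippet below is only a rough sketch of that idea: the helper names (`epochs_to_reach`, `speedup`) are hypothetical, the accuracy lists are made-up placeholders rather than measured results, and measuring in epochs instead of wall-clock time is an assumption (the thread does not say how the "14x" figure was measured).

```python
from typing import Optional, Sequence


def epochs_to_reach(acc_per_epoch: Sequence[float], target: float) -> Optional[int]:
    """Return the first 1-based epoch whose accuracy is >= target, or None."""
    for epoch, acc in enumerate(acc_per_epoch, start=1):
        if acc >= target:
            return epoch
    return None


def speedup(baseline: Sequence[float], candidate: Sequence[float],
            target: float) -> Optional[float]:
    """Ratio of epochs needed by the baseline vs. the candidate optimizer.

    Returns None if either run never reaches the target accuracy.
    """
    base_epochs = epochs_to_reach(baseline, target)
    cand_epochs = epochs_to_reach(candidate, target)
    if base_epochs is None or cand_epochs is None:
        return None
    return base_epochs / cand_epochs


# Placeholder curves for illustration only (NOT real measurements).
sgd_acc = [0.30, 0.45, 0.55, 0.62, 0.68, 0.72]
kfac_acc = [0.35, 0.52, 0.63, 0.70, 0.74, 0.77]

# Speedup in epochs at a 60% training-accuracy target.
print(speedup(sgd_acc, kfac_acc, target=0.60))
```

In practice you would replace the placeholder lists with the per-epoch training accuracy logged by each run. Keep in mind that a per-epoch comparison ignores the extra cost of each K-FAC step, so wall-clock time may tell a different story.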

james-martens • Feb 09 '22 01:02