optimum-graphcore
Set LayerNorm's eps to a number that is larger than 6e-5
Currently, many LayerNorm eps values are smaller than 6.1e-5 (the smallest positive normal fp16 value, 2^-14), which might cause underflow.
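To illustrate the concern, here is a minimal sketch that uses only the Python standard library. It round-trips a few eps values through IEEE 754 half precision with the `struct` module's `e` format. The helper name `to_fp16` and the sample eps values are illustrative assumptions, not taken from any particular model config.

```python
import struct

# Smallest positive normal fp16 value: 2**-14, roughly 6.1e-5.
FP16_MIN_NORMAL = 2.0 ** -14


def to_fp16(value: float) -> float:
    """Round a Python float to the nearest fp16 value and return it as a float."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


# Hypothetical eps values chosen for illustration.
for eps in (1e-12, 1e-5, 1e-4):
    rounded = to_fp16(eps)
    below_normal = eps < FP16_MIN_NORMAL
    print(f"eps={eps:g} fp16={rounded!r} below_min_normal={below_normal}")
```

An eps of 1e-12 rounds to exactly 0.0 in fp16, so it no longer protects against division by zero. An eps of 1e-5 lands in the subnormal range, where it is still nonzero but stored with reduced precision.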
@michaelbenayoun is this something that could be done easily as a torch.fx transform?
I would think this is something handled by PyTorch, but to answer your question @jimypbr, yes, definitely!
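For reference, a rough sketch of what such a pass might look like. This is not the optimum-graphcore implementation: the function name `clamp_layernorm_eps`, the default `min_eps=1e-4`, and the `TinyBlock` toy model are all assumptions made for illustration. It relies on `torch.fx.symbolic_trace` treating `nn.LayerNorm` as a leaf module, so each LayerNorm appears as a `call_module` node.

```python
import torch
from torch import nn
from torch.fx import GraphModule, symbolic_trace


def clamp_layernorm_eps(gm: GraphModule, min_eps: float = 1e-4) -> GraphModule:
    """Raise eps on every traced nn.LayerNorm whose eps is below min_eps.

    min_eps=1e-4 is an assumed default, chosen only because it is above
    the smallest positive normal fp16 value (about 6.1e-5).
    """
    for node in gm.graph.nodes:
        if node.op != "call_module":
            continue
        submodule = gm.get_submodule(node.target)
        if isinstance(submodule, nn.LayerNorm) and submodule.eps < min_eps:
            # eps is read at call time, so no graph rewrite is needed.
            submodule.eps = min_eps
    return gm


class TinyBlock(nn.Module):
    """Hypothetical toy model used only to exercise the pass."""

    def __init__(self) -> None:
        super().__init__()
        self.linear = nn.Linear(8, 8)
        self.norm = nn.LayerNorm(8, eps=1e-12)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.norm(self.linear(x))


traced = clamp_layernorm_eps(symbolic_trace(TinyBlock()))
output = traced(torch.randn(2, 8))
```

Since the pass only mutates a module attribute, the same effect could be had by looping over `model.modules()` without tracing; the fx form is mainly useful if it is meant to slot into an existing pipeline of graph transforms.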