Yixin Jin

2 comments by Yixin Jin

But it says in the paper that "During test time we turn batch-normalization update off to obtain an output that deterministically depends only on the input [32]."

Same error here. Does this mean I cannot use the fp8 model?