beapirate
Results: 2 comments of beapirate
> It would be great if agent developers could agree on a common standard for this. And I think Claude Code could be a pioneer by implementing this. Except for...
I can reliably reproduce NaN issues by training a HF ModernBERT model in fp16 without using AMP; no problems with AMP thus far. Training a HF ModernBERT model in pure...