beapirate

Results: 2 comments by beapirate

> It would be great if agent developers could agree on a common standard for this. And I think Claude Code could be a pioneer by implementing this. Except for...

I can reliably reproduce NaN issues by training an HF ModernBERT model in fp16 without AMP; with AMP I have seen no problems so far. Training an HF ModernBERT model in pure...
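To illustrate one plausible way pure fp16 can produce NaN, here is a minimal standard-library sketch. It is an assumption for illustration, not the mechanism confirmed in the comment above and not ModernBERT code: the helper names `to_fp16` and `softmax_first` are hypothetical, and fp16 rounding is emulated with Python's `struct` half-precision format rather than run on a GPU.

```python
import math
import struct

FP16_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def to_fp16(x: float) -> float:
    """Round x to the nearest fp16 value (hypothetical helper).

    struct raises OverflowError for values beyond the fp16 range, so we map
    that case to +/-inf, which is what fp16 arithmetic would produce.
    """
    if math.isnan(x) or math.isinf(x):
        return x
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        return math.copysign(math.inf, x)


def softmax_first(logits, cast):
    """Softmax probability of the first logit, with every intermediate
    value passed through `cast` to emulate a given precision."""
    exps = [cast(math.exp(v)) for v in logits]
    total = cast(sum(exps))
    return exps[0] / total


if __name__ == "__main__":
    logits = [12.0, 1.0]  # exp(12) is about 162755, above FP16_MAX
    # Emulated pure fp16: exp overflows to inf, and inf / inf is NaN.
    print("fp16 :", softmax_first(logits, to_fp16))
    # Higher precision: the same computation stays finite.
    print("float:", softmax_first(logits, float))
```

Mixed-precision setups typically keep overflow-prone operations like this in higher precision, which is consistent with, though not proof of, the observation that the NaNs do not appear with AMP.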