Manan Shah
Results: 2 issues of Manan Shah
### 🐛 Describe the bug The tolerance when comparing loss in the gemma3 multimodal model needs to be set higher (atol, rtol = 1e-3) compared to others (atol=1e-8, rtol=1e-5) in order to pass...
Hello, so we discussed an approach to solve this: running all the benchmarks takes less than an hour. I tried it on a single H100 GPU and it took me...
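To illustrate the tolerances mentioned in the gemma3 issue, here is a minimal sketch of an allclose-style check, assuming the comparison follows the usual `|actual - expected| <= atol + rtol * |expected|` rule used by NumPy and PyTorch. The helper `is_close` and the loss values are hypothetical and not taken from the issue.

```python
def is_close(actual, expected, rtol=1e-5, atol=1e-8):
    """Allclose-style check: |actual - expected| <= atol + rtol * |expected|."""
    return abs(actual - expected) <= atol + rtol * abs(expected)


# Hypothetical loss values, chosen only for illustration.
reference_loss = 2.3456
observed_loss = 2.3461  # differs from the reference by about 5e-4

# Default tolerances (atol=1e-8, rtol=1e-5) reject a 5e-4 difference.
strict = is_close(observed_loss, reference_loss)

# The looser tolerances from the issue (atol=rtol=1e-3) accept it.
loose = is_close(observed_loss, reference_loss, rtol=1e-3, atol=1e-3)

print(strict, loose)
```

With these numbers the default threshold is about 2.3e-5 and the loosened one about 3.3e-3, so only the second check passes. This shows how large a gap the looser setting tolerates, not why the gemma3 losses diverge.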