Isaack Karanja

Results: 1 issue by Isaack Karanja

* Introduce flag to terminate jobs on MegaScale Runtime Errors
* Enable auto-restart of JAX process when errors occur
* Prevent silent hangs in multi-slice TPU configurations
* Reduce time...