Kelly A

Results 3 issues of Kelly A

### **Steps to reproduce:**
1. Set the PyTorchJob `restartPolicy: ExitCode`
2. Set `backoffLimit` > 1
3. Have a container exit with a non-zero exit code greater than 128

### **Observed...
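As a rough illustration of step 3 in the reproduction steps, the sketch below spawns a process that exits with code 137 and classifies it. The `is_retryable` helper is hypothetical, and the 128-255 "retryable" range is an assumption based on how the `ExitCode` restart policy is commonly documented; it is not taken from this issue.

```python
import subprocess
import sys


def is_retryable(exit_code: int) -> bool:
    """Hypothetical helper. Assumes the ExitCode policy treats
    1-127 as permanent failures and 128-255 as retryable ones."""
    return 128 <= exit_code <= 255


# Stand-in for a container process: exit with 137, which by shell
# convention is 128 + 9 (the number of SIGKILL).
result = subprocess.run([sys.executable, "-c", "import sys; sys.exit(137)"])

print(result.returncode, is_retryable(result.returncode))
```

A container command that ends with an exit code in this range should be enough to exercise the restart path described in the issue.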

kind/feature

### Observed Problem
**I tested this with PyTorchJobs, but presumably this would apply to other job types as well.**

If you fully shut down a node that the job is running...

kind/feature

## Is your feature request related to a problem? Please describe.
Some tasks may have different resource requirements than others. For example, prompt tuning a small model may require 1...