et-operator
Kubernetes Operator for AI and Big Data Elastic Training
Release note: support scaling for trainingjob, related to #16
- Fix "not exit" to "not exists" in ScaleIn section
- Fix "scalein" to "scaleout" in ScaleOut section
Fix spelling error "uppdate"
If replica + delta > max or replica + delta < min, fail the job or abort the scaling
Fixes #4
Hi! Is it stable now? Can it be used in a production environment? Thanks!
Currently the launcher uses ssh to attach to workers, which causes some problems: 1. it requires the workers to run sshd 2. the controller needs to create a keyPair as a secret for every job...