qiankunli

Results 23 comments of qiankunli

@phlogistonjohn you can ignore the "uat and product environment". When I restart the ceph-service pod, the warnings stop. ceph version 14.2.21 (5ef401921d7a88aea18ec7558f7f9374ebd8f5a6) nautilus (stable) ceph library: libcephfs/librados2/librbd1 version ```...

@phlogistonjohn just to introduce our usage of go-ceph: I need a RESTful API that supports CRUD (create/read/update/delete) on CephFS directories, so that other services (Java/Go etc.) can call it easily without integrating...

For `inqueue := minReq.Add(attr.allocated).Add(attr.inqueue).Sub(attr.elastic).LessEqual(attr.realCapability, api.Infinity)`, if we use `(1,1,1)` to represent `(cpu=1,mem=1,gpu=1)`, then `(1,1,0) + (6,6,6) + (0,0,0) - (0,0,0) = (7,7,6) < (8,16,2) ==> false`. In fact, the remaining resources...
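The arithmetic in the comment above can be replayed with a minimal Python sketch. It assumes the comparison is a plain element-wise "every dimension must be less than or equal" check and that the vectors appear in the same order as the terms of the expression; the helper names (`add`, `sub`, `less_equal`) are hypothetical stand-ins and not Volcano's actual `Resource` API.

```python
from typing import Tuple

# Hypothetical resource vector: (cpu, mem, gpu), mirroring the comment's notation.
Vec = Tuple[int, int, int]
DIMENSIONS = ("cpu", "mem", "gpu")


def add(a: Vec, b: Vec) -> Vec:
    """Element-wise sum of two resource vectors."""
    return tuple(x + y for x, y in zip(a, b))


def sub(a: Vec, b: Vec) -> Vec:
    """Element-wise difference of two resource vectors."""
    return tuple(x - y for x, y in zip(a, b))


def less_equal(a: Vec, b: Vec) -> bool:
    """True only if every dimension of a is <= the same dimension of b (assumed semantics)."""
    return all(x <= y for x, y in zip(a, b))


# Values taken from the comment, read in the order of the expression.
min_req = (1, 1, 0)
allocated = (6, 6, 6)
inqueue = (0, 0, 0)
elastic = (0, 0, 0)
real_capability = (8, 16, 2)

total = sub(add(add(min_req, allocated), inqueue), elastic)
can_enqueue = less_equal(total, real_capability)

# Dimensions where the summed demand exceeds the capability.
violating = [
    name for name, x, y in zip(DIMENSIONS, total, real_capability) if x > y
]

print(total, can_enqueue, violating)
```

Under these assumptions the check fails only because of the GPU dimension, even though the job in this example requests no GPU.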

@Garrybest my department works on machine learning platforms, and my workloads are tfjob/pytorchjob/vcjob (volcano). Is it useful to make sure that a workload cannot be split across multiple clusters? ```yaml apiVersion:...

It may be that queue2 and queue3 are calculated first, so the remaining resources are less than 80, but queue1's guaranteed resources are 80, which...

> What's the model relationship between queue and nodegroup.

Queues and nodegroups have a many-to-many relationship.

> Do we allow to configure two queues that affinity the same nodegroup?

Yes.

@shinytang6 @william-wang The upstream controller component will create a pod when the pending podgroup becomes Inqueue, and the CA (cluster autoscaler) will decide to scale up the node because of the...

Some advice: 1. In AI/Spark cases, there are always two pod roles: master/worker or driver/executor. If the master pod is evicted, the worker pods will fail too, so it...

+1 from a user who would benefit from this enhancement! I am looking forward to it.