Ricky Xu

Results 76 comments of Ricky Xu

> Sorry, I have stopped using autoscaler v2. I hope this bug can be fixed in v1 ([ray-project/ray#46492](https://github.com/ray-project/ray/issues/46492)). Sure - I will see if I have time to repro this...

@SimonCqk - would it be possible for you to provide the head node's logs? In particular, `gcs_server.out` and `gcs_server.err`, if any. It would also be nice to see `dashboard.log` and `raylet.out/err`...

So it looks like the job manager (which is a Ray actor) failed to connect to GCS at its creation. Will you be able to check if `ipv4:192.168.251.238:6379` (where GCS lives)...

Not too familiar with KubeRay personally, but yeah, I would expect the GCS server port to be available to all other worker pods. cc @architkulkarni
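
A quick way to test that expectation from inside a worker pod is a plain TCP connect to the GCS address. This is only a minimal sketch: `can_reach` is a hypothetical helper, not part of Ray or KubeRay, and the host and port are placeholders to swap for your own cluster's values. A successful connect only shows the port is open, not that GCS itself is healthy.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


# Example (placeholder address, replace with your head node's GCS host and port):
# print(can_reach("192.0.2.10", 6379))
```

If this returns `False` from a worker pod but `True` from the head pod, the problem is more likely in the network or service configuration than in Ray itself.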

Hey @jmakov - will you be able to get any `monitor.*` logs generated? That would be helpful to debug.

cc @gvspraveen - could someone from the cluster team help take a look? I believe this is more relevant to the cluster launcher as of now than to the actual autoscaling logic...