"executable file `python` not found in $PATH" when using runtime_env container in cluster based on anyscale/ray-ml:nightly-py38-cpu image.

Open onlyone2019 opened this issue 3 years ago • 0 comments

What happened + What you expected to happen

ray job submit --address='http://192.168.0.192:8265' --runtime-env-json='{"working_dir":"./","container":{"image": "anyscale/ray-ml:nightly-py38-cpu", "worker_path": "/root/python/ray/workers/default_worker.py", "run_options": ["--cap-drop SYS_ADMIN","--log-level=debug"]}}' -- python ./debug.py

I submitted a job using the command above, but I never got the result of f(x). The job seems to hang, stuck at building the runtime_env. In addition, raylet.err reports "executable file `python` not found in $PATH", and I don't know how to fix it.
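For readability, here is the same runtime_env written out as a Python dict and turned back into the command line. This is only a sketch using the standard library: `build_submit_command` is a hypothetical helper, not part of Ray, and the values are copied unchanged from the command above. It does not contact the cluster.

```python
import json
import shlex

# Values copied from the `ray job submit` command above; nothing here talks to a cluster.
runtime_env = {
    "working_dir": "./",
    "container": {
        "image": "anyscale/ray-ml:nightly-py38-cpu",
        "worker_path": "/root/python/ray/workers/default_worker.py",
        "run_options": ["--cap-drop SYS_ADMIN", "--log-level=debug"],
    },
}


def build_submit_command(address, env, entrypoint):
    """Hypothetical helper: return the argv list for `ray job submit`."""
    return [
        "ray", "job", "submit",
        f"--address={address}",
        f"--runtime-env-json={json.dumps(env)}",
        "--",  # everything after this is the job entrypoint
        *entrypoint,
    ]


argv = build_submit_command(
    "http://192.168.0.192:8265", runtime_env, ["python", "./debug.py"]
)
# shlex.join quotes the JSON so the line can be pasted into a shell.
print(shlex.join(argv))
```

Building the JSON with `json.dumps` avoids quoting mistakes in the long one-line `--runtime-env-json` argument.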

This is the output of `ray job submit`:

Job submission server address: http://192.168.0.192:8265
2022-08-10 14:36:41,777	INFO dashboard_sdk.py:319 -- Package gcs://_ray_pkg_698a6544fb43c3a9.zip already exists, skipping upload.

-------------------------------------------------------
Job 'raysubmit_G3syLLn4YmfZ28um' submitted successfully
-------------------------------------------------------

Next steps
  Query the logs of the job:
    ray job logs raysubmit_G3syLLn4YmfZ28um
  Query the status of the job:
    ray job status raysubmit_G3syLLn4YmfZ28um
  Request the job to be stopped:
    ray job stop raysubmit_G3syLLn4YmfZ28um

Tailing logs until the job exits (disable with --no-wait):

I checked the job status with `ray job status raysubmit_G3syLLn4YmfZ28um`:

Job submission server address: None
2022-08-10 14:42:06,737	INFO dashboard_sdk.py:129 -- No address provided, defaulting to http://localhost:8265.
Status for job 'raysubmit_G3syLLn4YmfZ28um': PENDING
Status message: Job has not started yet, likely waiting for the runtime_env to be set up.

I got some error messages from raylet.err:

time="2022-08-10T14:43:46+08:00" level=debug msg="Received: -1"
time="2022-08-10T14:43:46+08:00" level=debug msg="Cleaning up container 5076a7f5e866e4d7f1afa374a36dcc5c5b561477172319c54ebb023b08f45c83"
time="2022-08-10T14:43:46+08:00" level=debug msg="Network is already cleaned up, skipping..."
time="2022-08-10T14:43:53+08:00" level=debug msg="unmounted container \"5076a7f5e866e4d7f1afa374a36dcc5c5b561477172319c54ebb023b08f45c83\""
time="2022-08-10T14:43:54+08:00" level=debug msg="ExitCode msg: \"executable file `python` not found in $path: no such file or directory: oci runtime attempted to invoke a command that was not found\""
Error: executable file `python` not found in $PATH: No such file or directory: OCI runtime attempted to invoke a command that was not found
[2022-08-10 14:43:58,130 E 1481817 1481817] (raylet) worker_pool.cc:500: Some workers of the worker process(1487985) have not registered within the timeout. The process is dead, probably it crashed during start.
time="2022-08-10T14:43:58+08:00" level=warning msg="Error validating CNI config file /home/wangjie/.config/cni/net.d/87-podman.conflist: [failed to find plugin \"bridge\" in path [/usr/local/libexec/cni /usr/libexec/cni /usr/local/lib/cni /usr/lib/cni /opt/cni/bin] failed to find plugin \"firewall\" in path [/usr/local/libexec/cni /usr/libexec/cni /usr/local/lib/cni /usr/lib/cni /opt/cni/bin]]"
Error: executable file `python` not found in $PATH: No such file or directory: OCI runtime attempted to invoke a command that was not found
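To pick the relevant lines out of a long raylet.err, a small filter like the one below may help. It is a minimal sketch: `find_missing_executables` is a hypothetical helper, and the pattern only assumes the error wording shown in the log above (matched case-insensitively, since the debug line prints `$path` in lowercase).

```python
import re

# Matches the OCI "command not found" error as it appears in raylet.err above.
MISSING_EXE = re.compile(
    r"executable file `(?P<exe>[^`]+)` not found in \$PATH", re.IGNORECASE
)


def find_missing_executables(log_text):
    """Hypothetical helper: return the executable names the OCI runtime could not find."""
    return {match.group("exe") for match in MISSING_EXE.finditer(log_text)}


# One line copied from the log above, used as sample input.
sample = (
    "Error: executable file `python` not found in $PATH: No such file or "
    "directory: OCI runtime attempted to invoke a command that was not found\n"
)
print(find_missing_executables(sample))
```

In practice the contents of raylet.err would be read from disk and passed in as `log_text`.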

Versions / Dependencies

ray: 3.0.0.dev0
python: 3.8

Reproduction script

debug.py

import ray
ray.init()

@ray.remote
def f(x):
    return x * x

futures = [f.remote(i) for i in range(2)]
print(ray.get(futures))

Issue Severity

High: It blocks me from completing my task.

onlyone2019 • Aug 10 '22 07:08

Is this the same question as the one asked on discuss? https://discuss.ray.io/t/how-does-container-in-runtime-env-work/7108

jjyao • Aug 11 '22 16:08

@jjyao yes, I posted it on Tuesday.

onlyone2019 • Aug 12 '22 01:08