"executable file `python` not found in $PATH" when using runtime_env container in cluster based on anyscale/ray-ml:nightly-py38-cpu image.
What happened + What you expected to happen
ray job submit --address='http://192.168.0.192:8265' --runtime-env-json='{"working_dir":"./","container":{"image": "anyscale/ray-ml:nightly-py38-cpu", "worker_path": "/root/python/ray/workers/default_worker.py", "run_options": ["--cap-drop SYS_ADMIN","--log-level=debug"]}}' -- python ./debug.py
I submitted a job using the command above, but I never got the result of f(x). The job appears to hang, stuck at building the runtime_env. In addition, raylet.err reports "executable file `python` not found in $PATH", and I don't know how to fix it.
This is the output of the submit command:
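For anyone reproducing this, hand-quoting the `--runtime-env-json` argument in the shell is easy to get wrong. Below is a minimal sketch, assuming only the Python standard library, that builds the same argument list from a dict. `build_submit_command` is a hypothetical helper name used for illustration, not part of Ray; the field values are copied from the command above.

```python
import json
import shlex


def build_submit_command(address, runtime_env, entrypoint):
    """Return the `ray job submit` argument list for a runtime_env dict.

    Hypothetical helper: it only assembles arguments, it does not call Ray.
    """
    return [
        "ray", "job", "submit",
        f"--address={address}",
        # json.dumps guarantees the value is valid JSON.
        f"--runtime-env-json={json.dumps(runtime_env)}",
        "--",
        *entrypoint,
    ]


# Same settings as in the report.
runtime_env = {
    "working_dir": "./",
    "container": {
        "image": "anyscale/ray-ml:nightly-py38-cpu",
        "worker_path": "/root/python/ray/workers/default_worker.py",
        "run_options": ["--cap-drop SYS_ADMIN", "--log-level=debug"],
    },
}

cmd = build_submit_command(
    "http://192.168.0.192:8265", runtime_env, ["python", "./debug.py"]
)

# shlex.join quotes each argument so the line can be pasted into a shell.
print(shlex.join(cmd))
```

The list form can also be passed directly to `subprocess.run(cmd)`, which sidesteps shell quoting entirely. This only rules out a malformed JSON argument; it does not address the container error itself.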
Job submission server address: http://192.168.0.192:8265
2022-08-10 14:36:41,777 INFO dashboard_sdk.py:319 -- Package gcs://_ray_pkg_698a6544fb43c3a9.zip already exists, skipping upload.
-------------------------------------------------------
Job 'raysubmit_G3syLLn4YmfZ28um' submitted successfully
-------------------------------------------------------
Next steps
Query the logs of the job:
ray job logs raysubmit_G3syLLn4YmfZ28um
Query the status of the job:
ray job status raysubmit_G3syLLn4YmfZ28um
Request the job to be stopped:
ray job stop raysubmit_G3syLLn4YmfZ28um
Tailing logs until the job exits (disable with --no-wait):
I checked the job status with ray job status raysubmit_G3syLLn4YmfZ28um:
Job submission server address: None
2022-08-10 14:42:06,737 INFO dashboard_sdk.py:129 -- No address provided, defaulting to http://localhost:8265.
Status for job 'raysubmit_G3syLLn4YmfZ28um': PENDING
Status message: Job has not started yet, likely waiting for the runtime_env to be set up.
I got some error messages from raylet.err:
time="2022-08-10T14:43:46+08:00" level=debug msg="Received: -1"
time="2022-08-10T14:43:46+08:00" level=debug msg="Cleaning up container 5076a7f5e866e4d7f1afa374a36dcc5c5b561477172319c54ebb023b08f45c83"
time="2022-08-10T14:43:46+08:00" level=debug msg="Network is already cleaned up, skipping..."
time="2022-08-10T14:43:53+08:00" level=debug msg="unmounted container \"5076a7f5e866e4d7f1afa374a36dcc5c5b561477172319c54ebb023b08f45c83\""
time="2022-08-10T14:43:54+08:00" level=debug msg="ExitCode msg: \"executable file `python` not found in $path: no such file or directory: oci runtime attempted to invoke a command that was not found\""
Error: executable file `python` not found in $PATH: No such file or directory: OCI runtime attempted to invoke a command that was not found
[2022-08-10 14:43:58,130 E 1481817 1481817] (raylet) worker_pool.cc:500: Some workers of the worker process(1487985) have not registered within the timeout. The process is dead, probably it crashed during start.
time="2022-08-10T14:43:58+08:00" level=warning msg="Error validating CNI config file /home/wangjie/.config/cni/net.d/87-podman.conflist: [failed to find plugin \"bridge\" in path [/usr/local/libexec/cni /usr/libexec/cni /usr/local/lib/cni /usr/lib/cni /opt/cni/bin] failed to find plugin \"firewall\" in path [/usr/local/libexec/cni /usr/libexec/cni /usr/local/lib/cni /usr/lib/cni /opt/cni/bin]]"
Error: executable file `python` not found in $PATH: No such file or directory: OCI runtime attempted to invoke a command that was not found
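With `--log-level=debug` in `run_options`, most of raylet.err is debug-level output from the container runtime, which buries the actual errors. A minimal sketch, assuming the `level=...` field format seen in the lines above, that drops the debug lines and keeps everything else. `important_lines` is a hypothetical helper name, not a Ray or podman API.

```python
import re

# Matches the `level=<name>` field in the container runtime's log lines.
LEVEL_RE = re.compile(r"\blevel=(\w+)")


def important_lines(log_text):
    """Return non-empty log lines that are not tagged level=debug."""
    kept = []
    for line in log_text.splitlines():
        match = LEVEL_RE.search(line)
        if match and match.group(1) == "debug":
            continue  # skip debug-level chatter
        if line.strip():
            kept.append(line)
    return kept


# Two lines taken from the raylet.err excerpt above.
sample = (
    'time="2022-08-10T14:43:46+08:00" level=debug msg="Received: -1"\n'
    "Error: executable file `python` not found in $PATH: No such file or "
    "directory: OCI runtime attempted to invoke a command that was not found\n"
)

for line in important_lines(sample):
    print(line)
```

Applied to the full file (for example `important_lines(open("raylet.err").read())`, where the path is an assumption), this leaves the `Error:` lines, the raylet `worker_pool.cc` message, and the CNI warning.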
Versions / Dependencies
ray : 3.0.0.dev0
python : 3.8
Reproduction script
debug.py
import ray

# Connect to (or start) a Ray cluster.
ray.init()

@ray.remote
def f(x):
    return x * x

# Launch two remote tasks and print their results.
futures = [f.remote(i) for i in range(2)]
print(ray.get(futures))
Issue Severity
High: It blocks me from completing my task.
Is this the same question as the one asked on discuss: https://discuss.ray.io/t/how-does-container-in-runtime-env-work/7108 ?
@jjyao yes, I posted it on Tuesday.