
[question] I have an error.

Open pixar0407 opened this issue 4 years ago • 0 comments

After installing everything, I ran the code and got an error.

rls-prof python main_rlscope.py --rlscope-directory ./rlscope_tutorial

I have pasted the command-line output and one of the log files below. Any advice? (I have 8 GPUs at the moment.)

Command-line output

CMD: $ rls-calibrate run --verbosity progress --parallel-runs python main_rlscope.py --rlscope-directory ./rlscope_tutorial
PWD=/home/ygkim/drl/Mujoco-Pytorch
INFO | Run configurations:
./rlscope_tutorial/config_time_breakdown_repetition_*
./rlscope_tutorial/config_calibration_uninstrumented_repetition_*
./rlscope_tutorial/config_calibration_interception_repetition_*
./rlscope_tutorial/config_calibration_gpu_activities_repetition_*
./rlscope_tutorial/config_calibration_no_gpu_activities_repetition_*
./rlscope_tutorial/config_calibration_gpu_activities_api_time_repetition_*
./rlscope_tutorial/config_calibration_just_pyprof_annotations_repetition_*
./rlscope_tutorial/config_calibration_just_pyprof_interceptions_repetition_*
INFO | Running configurations...
CMD: $ rls-run-expr --verbosity progress --skip-final-error-message --run-sh --sh ./rlscope_tutorial/run_expr.sh
PWD=/home/ygkim/drl/Mujoco-Pytorch
0% (0 of 8) | | Elapsed Time: 0:00:00 ETA: --:--:--
ERROR | Saw failed cmd in GPU[2] worker.
CMD: logfile=./rlscope_tutorial/config_calibration_uninstrumented_repetition_01/logfile.out
$ rls-prof --no-calibrate --config uninstrumented python main_rlscope.py --rlscope-calibration --rlscope-directory /home/ygkim/drl/Mujoco-Pytorch/rlscope_tutorial/config_calibration_uninstrumented_repetition_01 --rlscope-disable
ERROR | Saw failed cmd in GPU[3] worker.
CMD: logfile=./rlscope_tutorial/config_calibration_gpu_activities_repetition_01/logfile.out
$ rls-prof --no-calibrate --config gpu-activities python main_rlscope.py --rlscope-calibration --rlscope-directory /home/ygkim/drl/Mujoco-Pytorch/rlscope_tutorial/config_calibration_gpu_activities_repetition_01 --rlscope-disable-pyprof
ERROR | Saw failed cmd in GPU[1] worker.
CMD: logfile=./rlscope_tutorial/config_calibration_interception_repetition_01/logfile.out
$ rls-prof --no-calibrate --config interception python main_rlscope.py --rlscope-calibration --rlscope-directory /home/ygkim/drl/Mujoco-Pytorch/rlscope_tutorial/config_calibration_interception_repetition_01 --rlscope-disable-pyprof
ERROR | Saw failed cmd in GPU[0] worker.
CMD: logfile=./rlscope_tutorial/config_time_breakdown_repetition_01/logfile.out
$ rls-prof --no-calibrate --config time-breakdown python main_rlscope.py --rlscope-calibration --rlscope-directory /home/ygkim/drl/Mujoco-Pytorch/rlscope_tutorial/config_time_breakdown_repetition_01
ERROR | Saw failed cmd in GPU[7] worker.
CMD: logfile=./rlscope_tutorial/config_calibration_just_pyprof_interceptions_repetition_01/logfile.out
$ rls-prof --no-calibrate --config uninstrumented python main_rlscope.py --rlscope-calibration --rlscope-directory /home/ygkim/drl/Mujoco-Pytorch/rlscope_tutorial/config_calibration_just_pyprof_interceptions_repetition_01 --rlscope-disable-tfprof --rlscope-disable-pyprof-annotations
ERROR | Saw failed cmd in GPU[5] worker.
CMD: logfile=./rlscope_tutorial/config_calibration_gpu_activities_api_time_repetition_01/logfile.out
$ rls-prof --no-calibrate --config gpu-activities-api-time python main_rlscope.py --rlscope-calibration --rlscope-directory /home/ygkim/drl/Mujoco-Pytorch/rlscope_tutorial/config_calibration_gpu_activities_api_time_repetition_01 --rlscope-disable-pyprof
ERROR | Saw failed cmd in GPU[6] worker.
CMD: logfile=./rlscope_tutorial/config_calibration_just_pyprof_annotations_repetition_01/logfile.out
$ rls-prof --no-calibrate --config uninstrumented python main_rlscope.py --rlscope-calibration --rlscope-directory /home/ygkim/drl/Mujoco-Pytorch/rlscope_tutorial/config_calibration_just_pyprof_annotations_repetition_01 --rlscope-disable-tfprof --rlscope-disable-pyprof-interceptions
ERROR | Saw failed cmd in GPU[4] worker.
CMD: logfile=./rlscope_tutorial/config_calibration_no_gpu_activities_repetition_01/logfile.out
$ rls-prof --no-calibrate --config no-gpu-activities python main_rlscope.py --rlscope-calibration --rlscope-directory /home/ygkim/drl/Mujoco-Pytorch/rlscope_tutorial/config_calibration_no_gpu_activities_repetition_01 --rlscope-disable-pyprof
100% (8 of 8) |###########################################################################################| Elapsed Time: 0:00:02 Time: 0:00:02
ERROR | At least one run configuration failed; see their logfiles for details.

One of the log files

CMD: $ rls-prof --no-calibrate --config time-breakdown python main_rlscope.py.py --rlscope-calibration --rlscope-directory /home/ygkim/drl/Mujoco-Pytorch/rlscope_tutorial/config_time_breakdown_repetition_01
PWD=/home/ygkim/drl/Mujoco-Pytorch
Environment:
CUDA_VISIBLE_DEVICES=0
CMD: $ /home/ygkim/anaconda3/envs/rlscope/bin/python main_rlscope.py.py --rlscope-calibration --rlscope-directory /home/ygkim/drl/Mujoco-Pytorch/rlscope_tutorial/config_time_breakdown_repetition_01
PWD=/home/ygkim/drl/Mujoco-Pytorch
Environment:
LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64::/home/ygkim/.mujoco/mjpro150/bin:/home/ygkim/.mujoco/mujoco200/bin:/home/ygkim/anaconda3/envs/rlscope/lib/python3.6/site-packages/rlscope/cpp/lib
LD_PRELOAD=:librlscope.so
RLSCOPE_CONFIG=time-breakdown
RLSCOPE_CUDA_ACTIVITIES=yes
RLSCOPE_CUDA_API_CALLS=yes
RLSCOPE_CUDA_API_EVENTS=yes
RLSCOPE_GPU_HW=no
RLSCOPE_PC_SAMPLING=no
RLSCOPE_TRACE_AT_START=no
/home/ygkim/anaconda3/envs/rlscope/bin/python: error while loading shared libraries: libnvperf_host.so: cannot open shared object file: No such file or directory
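
The last line of the log says the loader could not find libnvperf_host.so. A quick way to narrow this down might be to check whether any directory on LD_LIBRARY_PATH contains that file. The sketch below is only a diagnostic under assumptions: the helper name `find_shared_library` is made up for illustration and is not part of rlscope. It also looks only at the colon-separated path it is given, whereas the real dynamic loader additionally consults its cache and default system directories.

```python
import os
from pathlib import Path


def find_shared_library(name, search_path):
    """Return the directories in a colon-separated path that contain `name`.

    Hypothetical helper for illustration only; not an rlscope API.
    """
    hits = []
    for entry in search_path.split(":"):
        # The dynamic loader treats an empty entry as the current directory;
        # this sketch simply skips empty entries.
        if not entry:
            continue
        if (Path(entry) / name).is_file():
            hits.append(entry)
    return hits


if __name__ == "__main__":
    library = "libnvperf_host.so"  # the name taken from the error message
    path = os.environ.get("LD_LIBRARY_PATH", "")
    hits = find_shared_library(library, path)
    if hits:
        print(f"{library} found in: {', '.join(hits)}")
    else:
        print(f"{library} not found in any LD_LIBRARY_PATH directory")
```

If it reports that nothing was found, the library is probably either not installed or installed in a directory that is not on the path. Which of the two applies cannot be told from the log alone.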

pixar0407, Mar 15 '22 01:03