
Why am I getting a numpy error?

Open kkkwjr opened this issue 1 year ago • 1 comments

(CogAgent) (.conda) (base) wpg@node7gpu:/workspace/kkkjr/Item/CogVLM/basic_demo$ torchrun --standalone --nnodes=1 --nproc-per-node=2 cli_demo_sat.py --from_pretrained cogagent-chat --version chat --bf16
/home/wpg/.local/lib/python3.10/site-packages/torch/_subclasses/functional_tensor.py:295: UserWarning: Failed to initialize NumPy: No module named 'numpy' (Triggered internally at ../torch/csrc/utils/tensor_numpy.cpp:84.)
  cpu = _conversion_method_template(device=torch.device("cpu"))
W1129 05:48:09.364000 64664 torch/distributed/run.py:793]
W1129 05:48:09.364000 64664 torch/distributed/run.py:793] *****************************************
W1129 05:48:09.364000 64664 torch/distributed/run.py:793] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
W1129 05:48:09.364000 64664 torch/distributed/run.py:793] *****************************************
/home/wpg/.local/lib/python3.10/site-packages/torch/_subclasses/functional_tensor.py:295: UserWarning: Failed to initialize NumPy: No module named 'numpy' (Triggered internally at ../torch/csrc/utils/tensor_numpy.cpp:84.)
  cpu = _conversion_method_template(device=torch.device("cpu"))
/home/wpg/.local/lib/python3.10/site-packages/torch/_subclasses/functional_tensor.py:295: UserWarning: Failed to initialize NumPy: No module named 'numpy' (Triggered internally at ../torch/csrc/utils/tensor_numpy.cpp:84.)
  cpu = _conversion_method_template(device=torch.device("cpu"))
Traceback (most recent call last):
Traceback (most recent call last):
  File "/workspace/kkkjr/Item/CogVLM/basic_demo/cli_demo_sat.py", line 7, in <module>
  File "/workspace/kkkjr/Item/CogVLM/basic_demo/cli_demo_sat.py", line 7, in <module>
    from sat.model.mixins import CachedAutoregressiveMixin
    from sat.model.mixins import CachedAutoregressiveMixin
  File "/home/wpg/.local/lib/python3.10/site-packages/sat/__init__.py", line 1, in <module>
  File "/home/wpg/.local/lib/python3.10/site-packages/sat/__init__.py", line 1, in <module>
    from .arguments import get_args, update_args_with_file
    from .arguments import get_args, update_args_with_file
  File "/home/wpg/.local/lib/python3.10/site-packages/sat/arguments.py", line 23, in <module>
  File "/home/wpg/.local/lib/python3.10/site-packages/sat/arguments.py", line 23, in <module>
    import numpy as np
ModuleNotFoundError: No module named 'numpy'
    import numpy as np
ModuleNotFoundError: No module named 'numpy'
E1129 05:48:11.400000 64664 torch/distributed/elastic/multiprocessing/api.py:869] failed (exitcode: 1) local_rank: 0 (pid: 64829) of binary: /usr/bin/python3
Traceback (most recent call last):
  File "/home/wpg/.local/bin/torchrun", line 8, in <module>
    sys.exit(main())
  File "/home/wpg/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
    return f(*args, **kwargs)
  File "/home/wpg/.local/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main
    run(args)
  File "/home/wpg/.local/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run
    elastic_launch(
  File "/home/wpg/.local/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/wpg/.local/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 269, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

cli_demo_sat.py FAILED

Failures:
[1]:
  time : 2024-11-29_05:48:11
  host : node7gpu
  rank : 1 (local_rank: 1)
  exitcode : 1 (pid: 64830)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html

Root Cause (first observed failure):
[0]:
  time : 2024-11-29_05:48:11
  host : node7gpu
  rank : 0 (local_rank: 0)
  exitcode : 1 (pid: 64829)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html

[image]

As shown here, I already have numpy installed, so I don't know why this still happens.

kkkwjr · Nov 29 '24 05:11
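The log in this issue reports `binary: /usr/bin/python3` and loads packages from `/home/wpg/.local/lib/python3.10/site-packages`, while the shell prompt shows an active conda environment. One possible explanation, which is an assumption and not confirmed anywhere in this thread, is that numpy is installed in a different environment from the one `torchrun` actually launches. A minimal sketch for checking which interpreter is running and whether it can see numpy; `describe_environment` is a hypothetical helper name, not part of CogVLM or sat:

```python
import importlib.util
import sys


def describe_environment(module_name="numpy"):
    """Report the running interpreter and where the module resolves, if at all."""
    # find_spec returns None when the top-level module cannot be located
    # on this interpreter's import path.
    spec = importlib.util.find_spec(module_name)
    return {
        "executable": sys.executable,
        "module": module_name,
        "found": spec is not None,
        "location": spec.origin if spec is not None else None,
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Saving this to a file and running it with the interpreter named in the log (for example `/usr/bin/python3`) and again with the conda environment's `python` would show whether the two disagree about numpy.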

Install numpy 1.26.3 at the highest; otherwise you will run into all sorts of strange errors. The version must not be too high.

MachineDora · Dec 06 '24 02:12
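The version cap suggested above can be checked with a short script. This is only a sketch: the 1.26.3 limit comes from the reply, while `parse_version`, `within_cap`, and `MAX_NUMPY` are hypothetical names introduced for illustration, and the parser deliberately ignores pre-release suffixes.

```python
from importlib import metadata

# Upper bound taken from the reply in this thread.
MAX_NUMPY = (1, 26, 3)


def parse_version(text):
    """Turn '1.26.3' into (1, 26, 3); non-numeric suffixes such as 'rc1' are dropped."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def within_cap(version_text, cap=MAX_NUMPY):
    """True when the given version is at or below the cap."""
    return parse_version(version_text) <= cap


if __name__ == "__main__":
    try:
        installed = metadata.version("numpy")
    except metadata.PackageNotFoundError:
        print("numpy is not installed for this interpreter")
    else:
        status = "ok" if within_cap(installed) else "newer than the suggested cap"
        print(f"numpy {installed}: {status}")
        # If it is too new, one option is:
        #   python -m pip install "numpy<=1.26.3"
```

Run it with the same interpreter that launches the demo, since each environment has its own installed packages.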