
Bugs while running ~/FedML/python/examples/simulation/mpi_torch_async_fedavg/torch_fedavg_mnist_lr_custum_data_and_model_example.py

Open yaokunxu opened this issue 2 years ago • 0 comments

When I follow the README to test the program, the following problems come up.

First one: I have no idea why this happens and have no way of finding out how the scripts are supposed to work. When the code runs, the shell reports "run_custom_data_and_model_example.sh: 11: -hostfile: not found" and "run.sh: 19: -host: not found".

Second one: This may indicate a bug in the code.
"[FedML-Server @device-id-0] [Sun, 16 Jul 2023 12:46:33] [ERROR] [mlops_runtime_log.py:36:handle_exception] Uncaught exception
Traceback (most recent call last):
File "/home/xuhd/FedML/python/examples/simulation/mpi_torch_async_fedavg/torch_fedavg_mnist_lr_custum_data_and_model_example.py", line 40, in <module>
simulator = SimulatorMPI(args, device, dataset, model)
File "/home/xuhd/FedML/python/fedml/simulation/simulator.py", line 107, in __init__
FedML_FedAvgSeq_distributed(
File "/home/xuhd/FedML/python/fedml/simulation/mpi/fedavg_seq/FedAvgSeqAPI.py", line 31, in FedML_FedAvgSeq_distributed
init_server(
File "/home/xuhd/FedML/python/fedml/simulation/mpi/fedavg_seq/FedAvgSeqAPI.py", line 99, in init_server
server_manager.send_init_msg()
File "/home/xuhd/FedML/python/fedml/simulation/mpi/fedavg_seq/FedAvgServerManager.py", line 42, in send_init_msg
client_schedule = self.aggregator.generate_client_schedule(self.args.round_idx, client_indexes)
File "/home/xuhd/FedML/python/fedml/simulation/mpi/fedavg_seq/FedAVGAggregator.py", line 183, in generate_client_schedule
client_schedule = np.array_split(client_indexes, self.worker_num)
File "/usr/local/lib/python3.10/dist-packages/numpy/lib/shape_base.py", line 770, in array_split
raise ValueError('number sections must be larger than 0.') from None
ValueError: number sections must be larger than 0."

Third one: The config file exists, but this error still occurs.
"Traceback (most recent call last):
File "/home/xuhd/FedML/python/examples/simulation/mpi_torch_async_fedavg/torch_fedavg_mnist_lr_custum_data_and_model_example.py", line 25, in <module>
args = fedml.init()
File "/home/xuhd/FedML/python/fedml/__init__.py", line 33, in init
args = load_arguments(fedml._global_training_type, fedml._global_comm_backend)
File "/home/xuhd/FedML/python/fedml/arguments.py", line 188, in load_arguments
args = Arguments(cmd_args, training_type, comm_backend)
File "/home/xuhd/FedML/python/fedml/arguments.py", line 74, in __init__
self.get_default_yaml_config(cmd_args, training_type, comm_backend)
File "/home/xuhd/FedML/python/fedml/arguments.py", line 129, in get_default_yaml_config
configuration = self.load_yaml_config(cmd_args.yaml_config_file)
File "/home/xuhd/FedML/python/fedml/arguments.py", line 80, in load_yaml_config
with open(yaml_path, "r") as stream:
FileNotFoundError: [Errno 2] No such file or directory: 'config/zht_config.yaml\r'"

Looking forward to your help. Respect. OTZ.
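
One thing that may be worth checking, offered as an assumption rather than a confirmed diagnosis: the path in the last error ends in `\r`, which usually means the shell script was saved with Windows (CRLF) line endings. That would also be consistent with the "-hostfile: not found" and "-host: not found" messages, since a backslash followed by a carriage return no longer continues the command onto the next line. The sketch below uses only the Python standard library; the helper names `has_crlf` and `strip_crlf` and the demo file are hypothetical and are not part of FedML.

```python
from pathlib import Path
import tempfile


def has_crlf(path):
    """Return True if the file contains Windows-style CRLF line endings."""
    return b"\r\n" in Path(path).read_bytes()


def strip_crlf(path):
    """Rewrite the file in place with LF endings.

    Returns the number of CRLF pairs that were replaced.
    """
    data = Path(path).read_bytes()
    count = data.count(b"\r\n")
    if count:
        Path(path).write_bytes(data.replace(b"\r\n", b"\n"))
    return count


if __name__ == "__main__":
    # Hypothetical stand-in for a launcher script saved with CRLF endings.
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "demo.sh"
        script.write_bytes(b"echo one \\\r\n  two\r\n")
        print("CRLF before:", has_crlf(script))
        print("pairs replaced:", strip_crlf(script))
        print("CRLF after:", has_crlf(script))
```

If `has_crlf` reports `True` for the `.sh` files in the example directory, converting them to LF endings (with a helper like the one above or an equivalent tool) and rerunning would show whether the first and third errors share this cause. The second error looks separate: the traceback shows `np.array_split` being called with `self.worker_num`, and NumPy raises that `ValueError` when the section count is not greater than 0.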

yaokunxu, Jul 16 '23 12:07