
fail to import FedAvgTrainer for standalone simulation

Open NeutrinoLiu opened this issue 4 years ago • 3 comments

The import fails at:

from fedml_api.standalone.fedavg.fedavg_trainer import FedAvgTrainer

in both trainer.py and group.py. There is no fedavg_trainer.py under api/standalone/fedavg/. I then tried to use the FedAvgTrainer from the distributed API instead, but its constructor signature does not match the call:

Traceback (most recent call last):
  File "./main.py", line 52, in <module>
    trainer = Trainer(dataset, model, device, args)
TypeError: __init__() missing 4 required positional arguments: 'train_data_num', 'device', 'args', and 'model_trainer'

NeutrinoLiu, Apr 14 '21 07:04
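The TypeError above is what Python raises when a class is called with fewer positional arguments than its constructor requires. A minimal sketch of checking this before the call, using only the standard library's inspect module, is shown here. The two trainer classes are hypothetical stand-ins, not FedML's real classes: only the last four parameter names come from the traceback, and `slot_1` through `slot_4` are placeholders for the leading parameters, whose real names are not given in this thread.

```python
import inspect


class StandaloneStyleTrainer:
    """Hypothetical stand-in: a trainer whose constructor takes four arguments."""

    def __init__(self, dataset, model, device, args):
        self.dataset, self.model, self.device, self.args = dataset, model, device, args


class DistributedStyleTrainer:
    """Hypothetical stand-in: a trainer with eight required parameters.

    slot_1..slot_4 are placeholder names (assumption); the last four names
    are the ones listed in the TypeError message.
    """

    def __init__(self, slot_1, slot_2, slot_3, slot_4,
                 train_data_num, device, args, model_trainer):
        self.device, self.args = device, args


def missing_arguments(cls, *args, **kwargs):
    """Return the required constructor parameters that this call leaves unfilled."""
    sig = inspect.signature(cls)  # signature of __init__, without self
    bound = sig.bind_partial(*args, **kwargs)  # bind what was passed, allow gaps
    variadic = (inspect.Parameter.VAR_POSITIONAL, inspect.Parameter.VAR_KEYWORD)
    return [
        name
        for name, param in sig.parameters.items()
        if name not in bound.arguments
        and param.default is inspect.Parameter.empty
        and param.kind not in variadic
    ]


if __name__ == "__main__":
    call_args = ("dataset", "model", "device", "args")
    print(missing_arguments(StandaloneStyleTrainer, *call_args))
    print(missing_arguments(DistributedStyleTrainer, *call_args))
```

With the four-argument call from main.py, the first class reports nothing missing, while the second reports the same four names as the traceback. This only diagnoses the mismatch; it does not make the two classes interchangeable.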

May I have your script to reproduce this error?

chaoyanghe, Apr 15 '21 08:04

cd ./fedml_experiments/standalone/hierarchical_fl
sh run_standalone_pytorch.sh 0 1000 1000 -1 mnist ./../../../data/mnist lr hetero 0.03 sgd random 2 5 2 1

Just use your CI-script-fedavg.sh, at the step that asserts that, for full batch and epochs=1, and when the product of the global and group communication rounds is fixed, the accuracy of hierarchical federated learning is equal to that of centralized training, regardless of the number of groups.

The import fails in group.py, and on the same line in trainer.py.

The error message I mentioned above occurs when I try to import FedAvgTrainer from the distributed configuration directory, which does have a definition of the FedAvgTrainer class.

NeutrinoLiu, Apr 15 '21 08:04

Hello, I have exactly the same issue here: there's no fedavg_trainer module under fedml_api.standalone.fedavg. Do you have any idea how to fix it? Thanks!

ljn1999, Sep 30 '21 17:09
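Both reports come down to a module path that cannot be resolved in the installed checkout. A small sketch for confirming which dotted paths resolve, without running the full experiment script, could look like the following. It uses the standard library's importlib.util.find_spec; the helper name `module_exists` is made up for illustration, and the only FedML path used is the one quoted in this thread. Note that find_spec imports the parent packages of a dotted name, so their `__init__.py` files will run.

```python
import importlib.util


def module_exists(dotted_name):
    """Return True if the dotted module path can be located, False otherwise."""
    try:
        # find_spec returns None when the final component is not found.
        return importlib.util.find_spec(dotted_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package in the dotted path does not exist.
        return False


if __name__ == "__main__":
    # Path taken from the failing import reported in this issue.
    name = "fedml_api.standalone.fedavg.fedavg_trainer"
    print(name, "found" if module_exists(name) else "not found")
```

Running this from the repository root, in the same environment as the experiment script, shows whether the module is missing from the checkout or merely not on the import path.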