Can accelerator.prepare only be run once?
System Info
- `Accelerate` version: 0.28.0
- Platform: Linux-5.4.250-2-velinux1u1-amd64-x86_64-with-glibc2.29
- Python version: 3.8.10
- Numpy version: 1.21.0
- PyTorch version (GPU?): 2.2.0+cu118 (True)
- PyTorch XPU available: False
- PyTorch NPU available: False
- System RAM: 2015.16 GB
- GPU type: NVIDIA A100-SXM4-80GB
- `Accelerate` default config:
Not found
Information
- [ ] The official example scripts
- [X] My own modified scripts
Tasks
- [ ] One of the scripts in the examples/ folder of Accelerate or an officially supported `no_trainer` script in the `examples` folder of the `transformers` repo (such as `run_no_trainer_glue.py`)
- [X] My own task or dataset (give details below)
Reproduction
I wrote code that calls accelerator.prepare more than once:

model, optimizer, train_dataloader, eval_dataloader = accelerator.prepare(
    model, optimizer, train_dataloader, eval_dataloader)
lr_scheduler = accelerator.prepare(lr_scheduler)

With this, lr_scheduler.step() behaves differently than when everything is prepared in a single call:
- With a single prepare call, inside `with accelerator.accumulate(model):` the prepared lr_scheduler.step() runs num_processes times on every real optimizer step (see the library code; a rough paraphrase follows below).
- With two prepare calls, where lr_scheduler is prepared separately afterwards, lr_scheduler.step() runs only once on every step inside `with accelerator.accumulate(model):`.

When lr_scheduler is passed to a second prepare call, does it interact differently with `with accelerator.accumulate(model):`?
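For context, a rough paraphrase of what the prepared scheduler appears to do on step() when everything is prepared together, based on the behavior described above (illustrative only, not the actual library source):

```python
# Rough paraphrase of the prepared scheduler's step() behavior
# (illustrative only, not the actual library code).
def wrapped_scheduler_step(scheduler, sync_gradients, num_processes):
    if not sync_gradients:
        # Still accumulating gradients: do not advance the schedule.
        return
    # A real optimizer step just happened: advance once per process,
    # since the effective batch size is num_processes times larger.
    for _ in range(num_processes):
        scheduler.step()
```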
Expected behavior
An explanation of why the two ways of calling prepare behave differently.
Hi @DavideHe, thanks for raising the issue. Could you share a minimal reproducer? The lr_scheduler should behave the same whether it is passed in the first or in a second prepare call. However, we expect users to call prepare only once. What behavior were you expecting? With accelerator.accumulate(model), the lr_scheduler should be updated after every gradient_accumulation_steps iterations. See the related issue https://github.com/huggingface/accelerate/issues/963
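For reference, here is a minimal, self-contained sketch of the single-prepare pattern; the model, data, and hyperparameters are dummy placeholders for illustration:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator

# Gradient accumulation is configured on the Accelerator itself
# (4 is an arbitrary illustrative value).
accelerator = Accelerator(gradient_accumulation_steps=4)

model = torch.nn.Linear(8, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1, gamma=0.9)
dataset = TensorDataset(torch.randn(64, 8), torch.randn(64, 1))
train_dataloader = DataLoader(dataset, batch_size=4)

# Everything, including the scheduler, goes through one prepare() call.
model, optimizer, train_dataloader, lr_scheduler = accelerator.prepare(
    model, optimizer, train_dataloader, lr_scheduler
)

for inputs, targets in train_dataloader:
    with accelerator.accumulate(model):
        loss = torch.nn.functional.mse_loss(model(inputs), targets)
        accelerator.backward(loss)
        optimizer.step()
        # The prepared scheduler skips its update on iterations where the
        # optimizer step is skipped for accumulation, so the learning rate
        # should only change once every gradient_accumulation_steps iterations.
        lr_scheduler.step()
        optimizer.zero_grad()
        print(lr_scheduler.get_last_lr()[-1])
```

Run on a single process, the printed learning rate should only change every 4 iterations.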
prepare twice:

model, optimizer, train_dataloader, eval_dataloader = accelerator.prepare(
    model, optimizer, train_dataloader, eval_dataloader)
lr_scheduler = accelerator.prepare(lr_scheduler)

for data in train_dataloader:
    with accelerator.accumulate(model):
        lr_scheduler.step()
        print(lr_scheduler.get_last_lr()[-1])

With the code above, the learning rate updates on every step when gradient_accumulation_steps > 1. But when everything is prepared in a single call, the learning rate only updates once every gradient_accumulation_steps steps.
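One way to see where the two setups diverge is to log `accelerator.sync_gradients` next to the learning rate; it should be True only on the iteration where the real optimizer step happens. A debugging sketch, assuming the dummy model, data, and prepared objects from the single-prepare example above:

```python
for step, (inputs, targets) in enumerate(train_dataloader):
    with accelerator.accumulate(model):
        loss = torch.nn.functional.mse_loss(model(inputs), targets)
        accelerator.backward(loss)
        optimizer.step()
        lr_scheduler.step()
        optimizer.zero_grad()
        # sync_gradients is True only once every gradient_accumulation_steps
        # iterations; a correctly wrapped scheduler should only change the
        # learning rate on those iterations.
        print(step, accelerator.sync_gradients, lr_scheduler.get_last_lr()[-1])
```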
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.