
DeepSpeed: support for the CosineAnnealingLR scheduler

Open hahchenchen opened this issue 2 years ago • 4 comments

DeepSpeed does not currently support the CosineAnnealingLR scheduler.

I would like to add support for it myself. The question is how to develop a custom scheduler. Are there any tutorials available?

hahchenchen avatar May 08 '23 14:05 hahchenchen

I have the same question.

DavdGao avatar Jun 25 '23 08:06 DavdGao

@hahchenchen and @DavdGao, unfortunately we don't have a tutorial for this. However, there are two options available:

  1. You can directly pass the torch implementation into deepspeed.initialize() as documented here.
  2. You can implement it directly in DeepSpeed by following existing custom implementations, such as WarmupLR or OneCycle.
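Option 1 can be sketched as follows. This is a hypothetical stand-in: `FakeOptimizer`, `CosineAnnealing`, and `initialize` below are minimal stubs written only to illustrate the calling convention (construct the scheduler on the optimizer first, then pass the instance in); in real code these would be a torch optimizer, `torch.optim.lr_scheduler.CosineAnnealingLR`, and `deepspeed.initialize`.

```python
import math

# --- stubs standing in for torch / deepspeed (illustrative only) ---

class FakeOptimizer:
    """Stands in for a torch optimizer; only tracks a learning rate."""
    def __init__(self, lr):
        self.lr = lr

class CosineAnnealing:
    """Stands in for torch.optim.lr_scheduler.CosineAnnealingLR:
    anneals the LR from its initial value down to eta_min over T_max steps."""
    def __init__(self, optimizer, T_max, eta_min=0.0):
        self.optimizer = optimizer
        self.base_lr = optimizer.lr
        self.T_max = T_max
        self.eta_min = eta_min
        self.t = 0

    def step(self):
        self.t += 1
        self.optimizer.lr = self.eta_min + 0.5 * (self.base_lr - self.eta_min) * (
            1 + math.cos(math.pi * self.t / self.T_max)
        )

def initialize(optimizer, lr_scheduler):
    """Stands in for deepspeed.initialize(): accepts a pre-built scheduler
    instance and hands it back for use in the training loop."""
    return optimizer, lr_scheduler

# Usage mirroring option 1: build the scheduler on the optimizer yourself,
# then pass the *instance* in.
opt = FakeOptimizer(lr=0.1)
sched = CosineAnnealing(opt, T_max=100)
opt, sched = initialize(opt, lr_scheduler=sched)

for _ in range(100):
    sched.step()  # LR follows the cosine curve down to eta_min
```

The key point of option 1 is that you own the construction of the scheduler, so any required constructor arguments are supplied before DeepSpeed ever sees it.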

tjruwase avatar Aug 14 '23 14:08 tjruwase

Hi @tjruwase ,

You can directly pass the torch implementation into deepspeed.initialize() as documented here.

For this option, do you mean something like the following?

model, optimizer, _, lr_scheduler = deepspeed.initialize(
    model=model,
    args=args,
    lr_scheduler=torch.optim.lr_scheduler.CosineAnnealingLR,
    config_params=ds_config,
)

If yes, CosineAnnealingLR has a required T_max argument. How do I pass that argument?

HsuWanTing avatar May 10 '24 10:05 HsuWanTing

@HsuWanTing, you can also pass the LR scheduler as a Callable, which should work for your case. Please see the following example: https://github.com/microsoft/DeepSpeed/blob/3dd7ccff8103be60c31d963dd2278d43abb68fd1/tests/unit/runtime/test_ds_initialize.py#L254
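Regarding the T_max question: the Callable approach lets you bind extra constructor arguments up front, so that what you hand to deepspeed.initialize() is a factory taking only the optimizer. A sketch of the binding mechanism using `functools.partial` (the `cosine_annealing` function below is a stub that only mirrors the signature shape of `CosineAnnealingLR(optimizer, T_max=...)`; in real code you would bind the torch class itself and pass the partial as `lr_scheduler=` to `deepspeed.initialize`):

```python
from functools import partial

class FakeOptimizer:
    """Minimal stand-in for a torch optimizer."""
    def __init__(self, lr):
        self.lr = lr

def cosine_annealing(optimizer, T_max, eta_min=0.0):
    """Stub with the same signature shape as CosineAnnealingLR.
    The framework only supplies `optimizer`, so T_max must be pre-bound."""
    return {"optimizer": optimizer, "T_max": T_max, "eta_min": eta_min}

# Bind T_max up front; the result is a Callable taking just the optimizer,
# which is the shape a Callable lr_scheduler argument expects.
scheduler_factory = partial(cosine_annealing, T_max=1000)

opt = FakeOptimizer(lr=0.01)
sched = scheduler_factory(opt)  # the framework would make this call internally
```

An equivalent one-liner is `lambda opt: CosineAnnealingLR(opt, T_max=1000)`; `partial` and `lambda` are interchangeable here.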

tjruwase avatar May 10 '24 13:05 tjruwase