SAM: Sharpness-Aware Minimization (PyTorch)
```
base_optimizer = torch.optim.Adam
optimizer = SAM(model.parameters(), base_optimizer, lr=0.1)

torch.save({"optz_state_dict": optimizer.state_dict()}, "state.pth")
checkpoint = torch.load("state.pth")
optimizer.load_state_dict(checkpoint["optz_state_dict"])
```
By using the above code, the saved state size is more than halved compared...
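The size difference is likely because the base optimizer's own state (e.g. Adam's moment buffers) lives on the wrapped optimizer rather than on the SAM wrapper itself. A minimal sketch of saving and restoring both state dicts, assuming the wrapper exposes the wrapped optimizer as `optimizer.base_optimizer` (as in this repo's implementation; adjust the attribute name if yours differs):

```
import torch

# hedged sketch: Adam's exp_avg / exp_avg_sq buffers live on the wrapped optimizer,
# so save its state dict alongside the SAM wrapper's own state dict
torch.save({
    "sam_state_dict": optimizer.state_dict(),
    "base_state_dict": optimizer.base_optimizer.state_dict(),
}, "state.pth")

checkpoint = torch.load("state.pth")
optimizer.base_optimizer.load_state_dict(checkpoint["base_state_dict"])
optimizer.load_state_dict(checkpoint["sam_state_dict"])  # loaded last so the wrapper can re-sync param_groups
```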
Link to the paper: https://arxiv.org/abs/2206.04920
Any chance of this being implemented in this module?
```
def _grad_norm(self):
    shared_device = self.param_groups[0]["params"][0].device  # put everything on the same device, in case of model parallelism
    norm = torch.norm(
        torch.stack([
            ((torch.abs(p) if group["adaptive"] else 1.0) * p.grad).norm(p=2).to(shared_device)
            for group in self.param_groups for p in group["params"]
            if p.grad is not None
        ]),
        p=2
    )
    return norm
```
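For context, a rough sketch of how this norm is typically consumed by the ascent step: the perturbation radius `rho` is divided by the global gradient norm, and each parameter is moved along its (optionally adaptively scaled) gradient. Names such as `rho`, `adaptive`, and `old_p` follow this repo's README; the exact implementation may differ in detail.

```
@torch.no_grad()
def first_step(self, zero_grad=False):
    # sketch: perturb each parameter along its gradient, with the step length
    # rho normalized by the global norm computed in _grad_norm
    grad_norm = self._grad_norm()
    for group in self.param_groups:
        scale = group["rho"] / (grad_norm + 1e-12)
        for p in group["params"]:
            if p.grad is None:
                continue
            self.state[p]["old_p"] = p.data.clone()  # remember w so second_step can restore it
            e_w = (torch.pow(p, 2) if group["adaptive"] else 1.0) * p.grad * scale.to(p)
            p.add_(e_w)  # move to the perturbed point w + e(w)
    if zero_grad:
        self.zero_grad()
```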
As per the title: if the closure is not None, we should assume that the parameters' gradients have not been computed, and immediately run the closure. See, e.g., the step function...
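A minimal sketch of the suggested behavior, assuming the two-step interface (`first_step`/`second_step`) used by this repo; the surrounding details of the real `step` method may differ:

```
import torch

@torch.no_grad()
def step(self, closure=None):
    assert closure is not None, "SAM requires a closure that recomputes the loss"
    closure = torch.enable_grad()(closure)  # the closure runs backward(), so grad must be re-enabled

    closure()                         # compute the initial gradients right away, as suggested
    self.first_step(zero_grad=True)   # perturb the weights using those gradients
    closure()                         # recompute the loss/gradients at the perturbed point
    self.second_step()                # restore the weights and apply the base optimizer's update
```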
Hello, I am trying to use the step function (with the transformers and accelerate libraries) while passing the closure. The step function has a @torch.no_grad() decorator, and thus we specify enable_grad...
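For reference, a minimal closure-based usage sketch: because `step` is decorated with `@torch.no_grad()`, the implementation typically re-wraps the closure with `torch.enable_grad()` so its backward pass still works. `model`, `criterion`, `inputs`, and `targets` below are placeholders.

```
def closure():
    # step() re-wraps this with torch.enable_grad(), so backward() works even
    # though step itself is decorated with @torch.no_grad()
    loss = criterion(model(inputs), targets)
    loss.backward()
    return loss

loss = criterion(model(inputs), targets)
loss.backward()
optimizer.step(closure)
optimizer.zero_grad()
```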
As mentioned in the README, the suggested usage can potentially cause problems if you use batch normalization. Will LayerNorm or GroupNorm cause problems in principle? I use SAM with a Swin Transformer and...
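LayerNorm and GroupNorm keep no running statistics, so in principle they do not suffer from this issue; the concern is specific to BatchNorm, whose running mean/variance would be updated twice per step (once per forward pass). A hedged sketch of the usual workaround, along the lines of the helpers shipped with this repo's example code, which freezes BN momentum during the second forward pass:

```
import torch
from torch.nn.modules.batchnorm import _BatchNorm

def disable_running_stats(model):
    # set BN momentum to 0 so the second forward pass does not update running stats
    def _disable(module):
        if isinstance(module, _BatchNorm):
            module.backup_momentum = module.momentum
            module.momentum = 0
    model.apply(_disable)

def enable_running_stats(model):
    # restore the original BN momentum before the first forward pass
    def _enable(module):
        if isinstance(module, _BatchNorm) and hasattr(module, "backup_momentum"):
            module.momentum = module.backup_momentum
    model.apply(_enable)

# per-batch usage (sketch):
# enable_running_stats(model);  first forward/backward;  optimizer.first_step(zero_grad=True)
# disable_running_stats(model); second forward/backward; optimizer.second_step(zero_grad=True)
```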
How can you combine SAM with GradScaler and gradient clipping, given that you can't unscale twice?
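One possible pattern, offered only as a sketch rather than an endorsed recipe: let the GradScaler track just the second pass, and unscale the first pass's gradients manually via `scaler.get_scale()`, so `scaler.unscale_` is called exactly once per update. `first_step`/`second_step` follow this repo's two-step interface; `model`, `criterion`, `optimizer`, and `loader` are placeholders, and the inf/nan handling is simplified (calling `second_step` directly bypasses `scaler.step`'s skip-on-overflow logic).

```
import torch

scaler = torch.cuda.amp.GradScaler()

for inputs, targets in loader:
    # ---- first pass: scaled backward, manual unscale, perturb the weights ----
    with torch.autocast(device_type="cuda"):
        loss = criterion(model(inputs), targets)
    scaler.scale(loss).backward()
    inv_scale = 1.0 / scaler.get_scale()
    for p in model.parameters():              # manual unscale avoids a second scaler.unscale_ call
        if p.grad is not None:
            p.grad.mul_(inv_scale)
    optimizer.first_step(zero_grad=True)

    # ---- second pass: the single tracked unscale_, clipping, then the real update ----
    with torch.autocast(device_type="cuda"):
        loss = criterion(model(inputs), targets)
    scaler.scale(loss).backward()
    scaler.unscale_(optimizer)                # the one unscale the scaler bookkeeps per update
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    optimizer.second_step(zero_grad=True)     # note: no inf/nan skip check is performed here
    scaler.update()
```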