Stochastic Weight Averaging

Anjum_Sayed · November 21, 2020, 12:14pm

I’m trying to implement SWA from this guide:

I’ve broken the example up as follows:

This bit goes in the __init__ for the LightningModule:

self.swa_model = AveragedModel(self.net)
self.swa_start = 5

This goes in configure_optimizers:

self.swa_scheduler = SWALR(optimizer, swa_lr=0.05)

And this bit in train_epoch_end:

if self.trainer.current_epoch > self.swa_start:
    self.swa_model.update_parameters(self.net)
    self.swa_scheduler.step()
    torch.optim.swa_utils.update_bn(self.train_dataloader(), self.swa_model)

I’m getting this error when torch.optim.swa_utils.update_bn is called:

RuntimeError: Expected tensor to have CPU Backend, but got tensor with CUDA Backend (while checking arguments for batch_norm_cpu)

I’m guessing I need to define self.swa_model in such a way it gets put onto the correct device

Is there an example somewhere to use SWA with PL? Thanks!

awaelchli · November 21, 2020, 11:12pm

is AveragedModel not a nn.Module child?
It should already be moved to the correct device automatically.

Anjum_Sayed · November 23, 2020, 1:11pm

Thanks for the reply! Yes, you’re right, it is a nn.Module. I printed a parameter from swa_model and it is indeed on the GPU.

The issue was actually the train_dataloader not being on the GPU. There is a device arg in torch.optim.swa_utils.update_bn which fixed the issue.

tchaton · November 23, 2020, 4:12pm

Hey @Anjum_Sayed,

Would you like to create a PR to SWA ? We can help you out !

Best regards,
T.C

Anjum_Sayed · November 24, 2020, 9:29am

Hi @tchaton, what kind of a PR were you thinking of? I think this more user error than an issue with PL. If you have any specific ideas, I’d be happy to help out!

Shing · July 22, 2021, 11:09am

Hi, do you mean you need to do update_bn(..., device="cpu")?
It would be good if lightning can handle gpu automatically as it’s within its design principles

Topic		Replies	Views
How to implement SWA? implementation help	1	1578	January 16, 2023
How to use SWA with a cyclic scheduler Trainer	0	494	May 7, 2023
StochasticWeightAveraging validation logging and checkpoints	2	899	January 16, 2023
Implement SCHEDULER OPTIMIZER in Pytorch Lightning implementation help	0	737	August 28, 2022
CPU / CUDA:0 RuntimeError - Help please! implementation help	1	6821	November 17, 2022

Stochastic Weight Averaging

Related topics