Hi,
I want to train two subnetworks, each with its own optimizer and scheduler.
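For context, here is a minimal sketch of my setup (the module names, layer sizes, and scheduler choices are just placeholders):

```python
import torch
import torch.nn as nn
import pytorch_lightning as pl

class TwoSubnetModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        # two subnetworks, each meant to have its own optimizer and scheduler
        self.net_a = nn.Linear(32, 16)
        self.net_b = nn.Linear(16, 1)

    def configure_optimizers(self):
        opt_a = torch.optim.Adam(self.net_a.parameters(), lr=1e-3)
        opt_b = torch.optim.Adam(self.net_b.parameters(), lr=1e-3)
        sched_a = torch.optim.lr_scheduler.StepLR(opt_a, step_size=10)
        sched_b = torch.optim.lr_scheduler.StepLR(opt_b, step_size=10)
        # two optimizers and two schedulers, one pair per subnetwork
        return [opt_a, opt_b], [sched_a, sched_b]
```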
However, as stated in the documentation: "If you use multiple optimizers, gradients will be calculated only for the parameters of the current optimizer at each training step."
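If I follow that multi-optimizer pattern (where `training_step` receives an `optimizer_idx` and is called once per optimizer), it would look roughly like this; the loss here is just a placeholder:

```python
    # continuing the sketch above, inside the same LightningModule
    def training_step(self, batch, batch_idx, optimizer_idx):
        x, y = batch
        out = self.net_b(self.net_a(x))  # the loss depends on BOTH subnetworks
        loss = nn.functional.mse_loss(out, y)
        # This method runs once per optimizer, and gradients are computed only
        # for that optimizer's parameters, so the two subnetworks never get
        # a joint update from this single loss.
        return loss
```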
So, with that pattern, I can't jointly train both subnetworks on a single shared loss in one training step. Is there any way to solve this problem? Thanks!