Effective learning rate and batch size with Lightning in DDP

goku · August 31, 2020, 7:30pm

I'm still a bit confused. In DDP, the backward pass runs on every device and the gradients are synced afterwards, so each device uses the batch_size assigned in the dataloader. As I understand it, learning_rate should then be set according to batch_size and not batch_size*N. In DP, on the other hand, the backward pass is done on batch_size*N on a single device, so should we set learning_rate = learning_rate*N there?
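To make the arithmetic behind the question concrete, here is a minimal sketch. It assumes the usual PyTorch behavior: under DDP each process draws its own dataloader batch and gradients are averaged across processes, while under DP a single dataloader batch is split across the devices. The function names (`effective_batch_size`, `linearly_scaled_lr`) are hypothetical and not part of Lightning, and linear learning-rate scaling is only a common heuristic, not a rule that Lightning applies for you.

```python
def effective_batch_size(batch_size: int, num_devices: int, strategy: str) -> int:
    """Samples contributing to one optimizer step (hypothetical helper)."""
    if strategy == "ddp":
        # Assumption: every process loads its own batch of `batch_size`
        # and gradients are averaged across all processes.
        return batch_size * num_devices
    if strategy == "dp":
        # Assumption: one batch of `batch_size` is split across the devices,
        # so each device sees roughly batch_size / num_devices samples.
        return batch_size
    raise ValueError(f"unknown strategy: {strategy!r}")


def linearly_scaled_lr(base_lr: float, base_batch_size: int, new_batch_size: int) -> float:
    """Linear scaling heuristic: learning rate grows in proportion to batch size."""
    return base_lr * new_batch_size / base_batch_size


if __name__ == "__main__":
    batch_size, n = 32, 4
    for strategy in ("ddp", "dp"):
        eff = effective_batch_size(batch_size, n, strategy)
        lr = linearly_scaled_lr(0.1, batch_size, eff)
        print(f"{strategy}: effective batch = {eff}, scaled lr = {lr:.3f}")
```

Under these assumptions, the learning rate is the one to reconsider in DDP, where the effective batch is batch_size*N, whereas in DP the effective batch stays at batch_size.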