Accumulated Gradients + DDP in Contrastive Learning?

I know that contrastive learning can be performed in PyTorch Lightning's DDP mode if a SyncFunction class is employed to all-gather embeddings across GPUs, so the loss can see negatives from every rank.
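For context, this is roughly what that SyncFunction looks like (a minimal sketch of the all-gather-with-gradient pattern, along the lines of the Lightning Bolts SimCLR implementation):

```python
import torch
import torch.distributed as dist

class SyncFunction(torch.autograd.Function):
    """All-gather embeddings from every rank while keeping gradients
    flowing back to the local shard."""

    @staticmethod
    def forward(ctx, tensor):
        ctx.batch_size = tensor.shape[0]
        gathered = [torch.zeros_like(tensor) for _ in range(dist.get_world_size())]
        dist.all_gather(gathered, tensor)  # gathered tensors carry no grad
        return torch.cat(gathered, dim=0)

    @staticmethod
    def backward(ctx, grad_output):
        grad_input = grad_output.clone()
        # Sum gradient contributions from all ranks, then return only
        # this rank's slice so each GPU backprops through its own samples.
        dist.all_reduce(grad_input, op=dist.ReduceOp.SUM)
        idx_from = dist.get_rank() * ctx.batch_size
        idx_to = (dist.get_rank() + 1) * ctx.batch_size
        return grad_input[idx_from:idx_to]
```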

However, I would also like to combine the gradient accumulation technique with DDP for contrastive learning, because my GPUs' memory is still limited.

I think GradCache is one possible solution.

Is it possible to combine GradCache and DDP mode in PyTorch Lightning?
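(Note that plain gradient accumulation doesn't help the contrastive loss itself, since each micro-batch only contrasts against its own negatives; GradCache caches representations across chunks so the loss sees the full batch. A minimal usage sketch, based on my reading of the GradCache README at github.com/luyug/GradCache; `encoder`, `contrastive_loss`, `x`, and `y` are placeholders, and the exact call signature should be checked against the repo:)

```python
from grad_cache import GradCache

# Assumptions: `encoder` is an nn.Module returning embeddings, and
# `contrastive_loss` maps two batches of representations to a scalar.
gc = GradCache(
    models=[encoder, encoder],  # one model per input stream (shared here)
    chunk_sizes=8,              # sub-batch size that fits in GPU memory
    loss_fn=contrastive_loss,
)

# One "big batch" step: GradCache chunks x and y internally, runs a
# no-grad forward to cache representations, computes the loss over the
# full batch, then replays each chunk with grad enabled.
loss = gc(x, y)
```

In Lightning this would presumably live inside `training_step` with manual optimization, since GradCache drives its own backward passes.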

Hi @Seungyoung_Park, for gradient accumulation you could also use the `accumulate_grad_batches` flag in the PyTorch Lightning Trainer. You can check it in the docs here.
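For reference, a minimal sketch of that flag (the values are illustrative):

```python
from pytorch_lightning import Trainer

# Accumulate gradients over 4 batches before each optimizer step,
# i.e. a 4x larger effective batch size per GPU.
trainer = Trainer(accumulate_grad_batches=4)

# It also accepts a schedule: from epoch 0 accumulate 1 batch,
# from epoch 4 accumulate 4 batches.
trainer = Trainer(accumulate_grad_batches={0: 1, 4: 4})
```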

Also, we have migrated to GitHub Discussions. To get a quicker response, please post your questions there.

Thanks :slight_smile: