Accumulated Gradients + DDP in Contrastive Learning?

I know that contrastive learning can be run in PyTorch Lightning DDP mode if the SyncFunction class is used.

However, I would like to combine gradient accumulation with DDP for contrastive learning, because my GPUs’ memory is still limited.

I think GradCache is one possible solution.

Is it possible to combine GradCache and DDP mode in PyTorch Lightning?

Hi @Seungyoung_Park, for accumulated gradients you could also use the accumulate_grad_batches flag of the PyTorch Lightning Trainer. You can check it in the docs here.
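
As a rough illustration of that flag, the sketch below collects the Trainer arguments one might pass for DDP with gradient accumulation and works out the resulting batch sizes. The helper names (ddp_accumulation_kwargs, effective_batch_size, contrast_pool_size) and the numbers are hypothetical, and the accelerator/devices/strategy arguments assume a reasonably recent PyTorch Lightning release. One caveat for contrastive learning, as far as I understand it: accumulation adds up gradients from separate forward passes, so each loss still only compares samples seen in a single step. It saves memory but does not by itself enlarge the pool of in-batch negatives, which is the gap GradCache is meant to close.

```python
def ddp_accumulation_kwargs(num_gpus, accumulate_grad_batches):
    """Keyword arguments for pytorch_lightning.Trainer (hypothetical helper).

    Assumes a recent Lightning release that accepts `accelerator`,
    `devices` and `strategy`.
    """
    return dict(
        accelerator="gpu",
        devices=num_gpus,
        strategy="ddp",
        # Run the optimizer step only once every N training batches.
        accumulate_grad_batches=accumulate_grad_batches,
    )


def effective_batch_size(per_gpu_batch, num_gpus, accumulate_grad_batches):
    """Number of samples that contribute to one optimizer step."""
    return per_gpu_batch * num_gpus * accumulate_grad_batches


def contrast_pool_size(per_gpu_batch, num_gpus, gather_across_gpus=True):
    """Number of samples a single contrastive loss can compare.

    Gathering embeddings across GPUs (the role SyncFunction plays) widens
    the pool to all GPUs; gradient accumulation does not change it.
    """
    return per_gpu_batch * num_gpus if gather_across_gpus else per_gpu_batch


# Example values, chosen only for illustration.
kwargs = ddp_accumulation_kwargs(num_gpus=2, accumulate_grad_batches=4)
print(effective_batch_size(32, 2, 4))  # 256 samples per optimizer step
print(contrast_pool_size(32, 2))       # 64 samples per contrastive loss

# With PyTorch Lightning installed, the arguments would be used like this:
#   import pytorch_lightning as pl
#   trainer = pl.Trainer(**kwargs)
#   trainer.fit(model, datamodule)
```

So with these example numbers the optimizer sees 256 samples per step, but each contrastive loss still only compares 64 of them.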

