I know that contrastive learning can be run in PyTorch Lightning's DDP mode if the SyncFunction class is used.
However, my GPU memory is still limited, so I would also like to use gradient accumulation together with DDP for contrastive learning.
I think GradCache is one possible solution.
Is it possible to combine GradCache with DDP mode in PyTorch Lightning?
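
To make the question concrete, here is a minimal single-process sketch of the gradient-caching idea as I understand it. It is plain PyTorch and not the GradCache library's API: `contrastive_loss`, `grad_cache_step`, and `chunk_size` are names made up for illustration, the encoder is assumed to be deterministic (no dropout), and nothing here handles DDP or Lightning yet.

```python
import torch
import torch.nn.functional as F


def contrastive_loss(q, k, temperature=0.1):
    # InfoNCE over in-batch negatives: row i of q is the positive for row i of k.
    logits = q @ k.t() / temperature
    labels = torch.arange(q.size(0), device=q.device)
    return F.cross_entropy(logits, labels)


def grad_cache_step(encoder, x_q, x_k, chunk_size):
    """Hypothetical helper: accumulate full-batch gradients chunk by chunk."""
    # Pass 1: encode every chunk without building an autograd graph.
    with torch.no_grad():
        q = torch.cat([encoder(c) for c in x_q.split(chunk_size)])
        k = torch.cat([encoder(c) for c in x_k.split(chunk_size)])

    # Compute the full-batch loss on the representations alone and
    # cache d(loss)/d(representation) for every sample.
    q.requires_grad_()
    k.requires_grad_()
    loss = contrastive_loss(q, k)
    loss.backward()
    q_grads = q.grad.split(chunk_size)
    k_grads = k.grad.split(chunk_size)

    # Pass 2: re-encode each chunk with a graph and push the cached
    # gradients through the encoder; parameter gradients accumulate.
    for x, grads in ((x_q, q_grads), (x_k, k_grads)):
        for chunk, g in zip(x.split(chunk_size), grads):
            encoder(chunk).backward(gradient=g)
    return loss.detach()
```

With `chunk_size` smaller than the batch, only one chunk's activations are kept in memory at a time, while the loss still sees every in-batch negative, which plain gradient accumulation over micro-batches would not give. What I do not know is how to make this second pass cooperate with DDP's gradient synchronization inside a LightningModule.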