Multi-GPU, TorchMetrics, incorrect aggregation

I am training on multiple GPUs with DDP. During validation, I would like to compute a statistic over the entire validation set, specifically torchmetrics.PearsonCorrCoef. Pearson correlation is not a linear function of the data, so computing it separately on the shard visible to each GPU and then averaging the results does not give the correct value: all predictions and labels need to be passed to the function together. From what I have read online, the recommendation is to use torchmetrics, but I could not work out from the documentation how to do this. I was hoping someone here has an explanation or a good reference. Thank you!
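
To make the problem concrete, here is a minimal plain-Python sketch with made-up numbers. The `pearson` helper and the two "GPU" shards are hypothetical and only for illustration; this is not torchmetrics or DDP code, just a demonstration of why averaging per-shard values goes wrong.

```python
import math


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up predictions and labels, split as if each GPU saw one shard.
preds_gpu0, labels_gpu0 = [1.0, 2.0, 3.0], [1.0, 2.0, 3.0]
preds_gpu1, labels_gpu1 = [10.0, 11.0, 12.0], [6.0, 5.0, 4.0]

# What I get if each GPU computes its own value and the results are averaged.
per_gpu_average = (
    pearson(preds_gpu0, labels_gpu0) + pearson(preds_gpu1, labels_gpu1)
) / 2

# What I actually want: the coefficient over all validation data at once.
global_value = pearson(preds_gpu0 + preds_gpu1, labels_gpu0 + labels_gpu1)

print("average of per-GPU values:", per_gpu_average)
print("value over all data:", global_value)
```

With these numbers the per-GPU values are +1 and -1, so their average is 0, while the coefficient over the combined data is roughly 0.86. That gap is the incorrect aggregation I am trying to avoid.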