In the case of DDP:

- Should the metrics be calculated in `validation_step`, or in `validation_step_end` after gathering the output tensors returned by `validation_step`?
- If the metrics are calculated in `validation_step`, would it be correct to take the mean of the corresponding metrics in `validation_step_end`, considering that the batch partitions for each device can be uneven?
- Does calling `all_gather` on the output tensors inside `validation_step_end` add an extra dimension before the batch dimension? For example, if my original batch tensor has shape `N x C x H x W` and 2 GPUs are in use, will the tensor after `all_gather` have shape `2 x M x C x H x W` (where `2M = N`)? What happens if the batch size (`N`) is an odd number?
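
To make the concern about uneven partitions concrete, here is a minimal sketch in plain Python (no Lightning calls). The per-device values are made up for illustration, and it assumes a batch of `N = 5` samples split 3/2 across two devices. It only shows the arithmetic of averaging per-device means versus averaging over all samples; it does not say how Lightning actually partitions the batch.

```python
# Hypothetical per-sample "correct" flags on two devices with uneven partitions.
device_outputs = [
    [1, 1, 1],  # device 0: 3 samples, all correct
    [0, 1],     # device 1: 2 samples, one correct
]


def mean(values):
    return sum(values) / len(values)


# Unweighted mean of the per-device accuracies.
mean_of_means = mean([mean(d) for d in device_outputs])

# Mean weighted by partition size, which equals the accuracy over all samples.
total_samples = sum(len(d) for d in device_outputs)
weighted_mean = sum(mean(d) * len(d) for d in device_outputs) / total_samples

print(mean_of_means)  # 0.75
print(weighted_mean)  # 0.8
```

The two numbers agree only when every device holds the same number of samples, which is why a plain mean of per-device metrics may be off when the partitions differ in size.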

- If the metrics are calculated in