How to sync ROUGE score between different processes?

I’m writing a project about fine-tuning a sequence-generation model. I’m looking for an example of how to gather the generation results from different GPUs (one machine, multiple GPUs) so I can compute a correct ROUGE score over the whole validation dataset. I know DDP can sync tensors across devices, but I have no idea how to gather ROUGE scores from different devices.

You may find it useful to take a look at `validation_epoch_end(self, outputs)` — `outputs` will contain all the outputs from `validation_step` (a list of the returned values).
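For instance, here is a minimal sketch of that pattern; the `compute_rouge` helper and the dict keys are hypothetical names for this example, not part of Lightning:

```python
# Hypothetical sketch: each validation_step returns its batch's ROUGE score,
# and validation_epoch_end receives the list of all those returned values.

def aggregate_rouge(outputs):
    """Average the 'rouge' values collected from every validation_step."""
    scores = [out["rouge"] for out in outputs]
    return sum(scores) / len(scores)

# Inside a LightningModule it would look roughly like this:
#
# def validation_step(self, batch, batch_idx):
#     preds = self.model.generate(batch["input_ids"])
#     return {"rouge": compute_rouge(preds, batch["references"])}  # hypothetical helper
#
# def validation_epoch_end(self, outputs):
#     self.log("val_rouge", aggregate_rouge(outputs))
```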

Also, there is `validation_step_end`, which gathers the outputs of a single `validation_step` across all GPUs in your node (for dp or ddp2). Note the difference: `validation_epoch_end` aggregates outputs from all batches, while `validation_step_end` aggregates outputs from a single step. So if you had a batch of 16 on 2 GPUs, `validation_step_end` would receive a 2-element tuple with 8 sequences in each, and if you had 3 batches, `validation_epoch_end` would receive a 3-element list with 16 sequences in each, if I am not mistaken 🙂
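To illustrate the shapes involved, here is a plain-Python sketch of merging what `validation_step_end` receives under dp/ddp2; the `preds` key is an assumption for this example:

```python
# Hypothetical sketch: under dp/ddp2, validation_step_end receives one entry
# per GPU, each holding that GPU's slice of the batch. Merging the entries
# restores the full batch before you score it.

def merge_step_outputs(step_outputs):
    """Flatten the per-GPU prediction lists back into one list."""
    merged = []
    for part in step_outputs:
        merged.extend(part["preds"])
    return merged

# e.g. a batch of 16 on 2 GPUs arrives as two parts of 8 sequences each,
# and merging gives back a single 16-sequence list.
```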

Also, I believe that `self.all_gather` can help you synchronise tensors across all devices.
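As a sketch of the reduction you would do after gathering: `self.all_gather` returns a tensor with a leading world-size dimension, and if the devices hold unequal numbers of samples you should weight by sample count rather than take a plain mean. The function below is a plain-Python stand-in for that reduction, not a Lightning API:

```python
# Hypothetical sketch: after self.all_gather you hold one local mean and one
# sample count per device; the correct global score is the count-weighted mean.

def global_mean(per_device_means, per_device_counts):
    """Combine per-device means into one global mean, weighted by counts."""
    total = sum(m * c for m, c in zip(per_device_means, per_device_counts))
    return total / sum(per_device_counts)

# Inside a LightningModule it might look roughly like this:
#
# def validation_epoch_end(self, outputs):
#     local = torch.tensor([o["rouge"] for o in outputs]).mean()
#     gathered = self.all_gather(local)        # shape: (world_size,)
#     self.log("val_rouge", gathered.mean())   # fine if devices are balanced
```

Note that a plain mean of per-device scores is only correct when every device sees the same number of samples; the distributed sampler may pad the last batch, so watch out for that.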

Maybe you need to combine some of these 🙂
If that wasn’t helpful, could you please provide more details on what you are doing, or show some examples?