How to sync ROUGE score between different processes?

I’m writing a project about fine-tuning a sequence-generation model. I’m looking for an example of how to gather the generation results from different GPUs (one machine, multiple GPUs) so I can compute a correct ROUGE score over the whole validation dataset. I know DDP can sync tensors across devices, but I have no idea how to gather ROUGE scores from different devices.

You may find it useful to take a look at `validation_epoch_end(self, outputs)` — `outputs` will contain all the outputs from `validation_step` (a list of the returned values).
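For instance, here is a minimal sketch of that pattern; the `compute_rouge` helper and the dict keys are hypothetical names for this example, not part of Lightning:

```python
# Hypothetical sketch: each validation_step returns its batch's ROUGE score,
# and validation_epoch_end receives the list of all those returned values.

def aggregate_rouge(outputs):
    """Average the 'rouge' values collected from every validation_step."""
    scores = [out["rouge"] for out in outputs]
    return sum(scores) / len(scores)

# Inside a LightningModule it would look roughly like this:
#
# def validation_step(self, batch, batch_idx):
#     preds = self.model.generate(batch["input_ids"])
#     return {"rouge": compute_rouge(preds, batch["references"])}  # hypothetical helper
#
# def validation_epoch_end(self, outputs):
#     self.log("val_rouge", aggregate_rouge(outputs))
```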

Also, there is `validation_step_end`, which gathers the outputs of a single `validation_step` across all GPUs in your node (for dp or ddp2). Note the difference: `validation_epoch_end` aggregates outputs from all batches, while `validation_step_end` aggregates outputs from a single step. So if you had a batch of 16 on 2 GPUs, `validation_step_end` would receive a 2-element tuple with 8 sequences in each, and if you had 3 batches, `validation_epoch_end` would receive a 3-element list with 16 sequences in each, if I am not mistaken 🙂
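To illustrate the shapes involved, here is a plain-Python sketch of merging what `validation_step_end` receives under dp/ddp2; the `preds` key is an assumption for this example:

```python
# Hypothetical sketch: under dp/ddp2, validation_step_end receives one entry
# per GPU, each holding that GPU's slice of the batch. Merging the entries
# restores the full batch before you score it.

def merge_step_outputs(step_outputs):
    """Flatten the per-GPU prediction lists back into one list."""
    merged = []
    for part in step_outputs:
        merged.extend(part["preds"])
    return merged

# e.g. a batch of 16 on 2 GPUs arrives as two parts of 8 sequences each,
# and merging gives back a single 16-sequence list.
```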

Also, I believe that `self.all_gather` can help you synchronise tensors across all devices.
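As a sketch of the reduction you would do after gathering: `self.all_gather` returns a tensor with a leading world-size dimension, and if the devices hold unequal numbers of samples you should weight by sample count rather than take a plain mean. The function below is a plain-Python stand-in for that reduction, not a Lightning API:

```python
# Hypothetical sketch: after self.all_gather you hold one local mean and one
# sample count per device; the correct global score is the count-weighted mean.

def global_mean(per_device_means, per_device_counts):
    """Combine per-device means into one global mean, weighted by counts."""
    total = sum(m * c for m, c in zip(per_device_means, per_device_counts))
    return total / sum(per_device_counts)

# Inside a LightningModule it might look roughly like this:
#
# def validation_epoch_end(self, outputs):
#     local = torch.tensor([o["rouge"] for o in outputs]).mean()
#     gathered = self.all_gather(local)        # shape: (world_size,)
#     self.log("val_rouge", gathered.mean())   # fine if devices are balanced
```

Note that a plain mean of per-device scores is only correct when every device sees the same number of samples; the distributed sampler may pad the last batch, so watch out for that.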

Maybe you need to combine some of these 🙂
If that wasn’t helpful, could you please provide more details on what you are doing, or show some examples?