How to keep track of training time in a DDP setting?

Yes, he is right: logging on every step with sync_dist=True is not recommended, because each call adds an expensive synchronization across processes that slows down your training loop. You should only enable it where it is actually needed and the slowdown is acceptable. In most cases it is not necessary to sync on every step. For example, to compute your average time, you could accumulate the timings locally and log just once at the end of the epoch instead of on every step.
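To make the end-of-epoch suggestion concrete, here is a minimal sketch. The `StepTimer` helper, the `self.timer` attribute, and the metric name `avg_step_time` are my own illustrative names, not part of Lightning. The timer only accumulates durations locally, so nothing is synchronized during the steps. The commented part shows how it could be wired into the standard `LightningModule` hooks, assuming a recent Lightning version.

```python
import time


class StepTimer:
    """Hypothetical helper: accumulates per-step wall-clock time locally.

    Nothing here communicates across processes, so calling it on every
    step does not add any synchronization cost.
    """

    def __init__(self, clock=time.perf_counter):
        self._clock = clock  # injectable so the timer can be tested
        self._start = None
        self.total = 0.0
        self.count = 0

    def start(self):
        self._start = self._clock()

    def stop(self):
        if self._start is None:  # stop() without a matching start(): ignore
            return
        self.total += self._clock() - self._start
        self.count += 1
        self._start = None

    def mean(self):
        # Average step duration in seconds; 0.0 if no step was recorded.
        return self.total / self.count if self.count else 0.0

    def reset(self):
        self._start = None
        self.total = 0.0
        self.count = 0


# Sketch of the wiring inside a LightningModule (assumes self.timer = StepTimer()
# was created in __init__; not executed here):
#
#   def on_train_batch_start(self, batch, batch_idx):
#       self.timer.start()
#
#   def on_train_batch_end(self, outputs, batch, batch_idx):
#       self.timer.stop()
#
#   def on_train_epoch_end(self):
#       # One synchronization per epoch instead of one per step.
#       self.log("avg_step_time", self.timer.mean(), sync_dist=True)
#       self.timer.reset()
```

With this layout the only cross-process reduction happens in `on_train_epoch_end`, so the cost of `sync_dist=True` is paid once per epoch rather than once per batch.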
