Why `num_replicas` != `world_size`?

Hi,
Why do the DDP (and FSDP) strategies in Fabric set `num_replicas` for data loaders to
`num_nodes * num_processes`
instead of simply `world_size`, as the DeepSpeed strategy does?

I am running on a cluster where a given job does not always get the same number of GPUs on every node. In that case `num_nodes * num_processes` no longer equals the actual world size, and `fabric.setup_dataloaders` fails.
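
To make the mismatch concrete, here is a minimal sketch in plain Python (no Fabric or PyTorch involved). The GPU counts and the `shard_indices` helper are made up for illustration, and the strided split only approximates how a distributed sampler divides a dataset; it is not Fabric's actual code.

```python
def shard_indices(dataset_len, num_replicas, rank):
    """Strided split of dataset indices across replicas (hypothetical helper,
    no shuffling or padding)."""
    return list(range(rank, dataset_len, num_replicas))


# Assumed allocation: node 0 got 4 GPUs, node 1 got only 2.
gpus_per_node = [4, 2]
num_nodes = len(gpus_per_node)

# The real number of processes in the job.
world_size = sum(gpus_per_node)

# What each node would compute as num_nodes * (its local process count).
product_per_node = [num_nodes * g for g in gpus_per_node]

dataset_len = 24


def missing_indices(num_replicas):
    """Indices that no existing rank would ever load."""
    covered = set()
    for rank in range(world_size):
        covered.update(shard_indices(dataset_len, num_replicas, rank))
    return sorted(set(range(dataset_len)) - covered)


# Sharding by the product computed on node 0 leaves gaps,
# because ranks 6 and 7 do not exist in this job.
missing_with_product = missing_indices(product_per_node[0])

# Sharding by world_size covers every index exactly once.
missing_with_world_size = missing_indices(world_size)

print("world_size:", world_size)
print("num_nodes * num_processes per node:", product_per_node)
print("missing with product:", missing_with_product)
print("missing with world_size:", missing_with_world_size)
```

In this made-up setup the two nodes do not even agree with each other on the product, while `world_size` is the same everywhere, which is why I would expect `world_size` to be the safer value for `num_replicas`.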
