thunder.plugins.DDP
- class thunder.plugins.DDP(bucket_size_in_mb=25.0, broadcast_from=None, process_group=None)

Bases: Plugin
Plugin for enabling Distributed Data Parallel (DDP) training in Thunder.
This plugin applies the necessary transforms to bucket and synchronize gradients across multiple processes, using a specified process group for communication.
See https://github.com/pytorch/pytorch/blob/v2.7.0/torch/nn/parallel/distributed.py#L326 for more details.
- Parameters:
  - bucket_size_in_mb (float, default 25.0) – Size in megabytes of each gradient bucket in DDP.
  - broadcast_from (int | None, default None) – Global rank to broadcast model parameters from at initialization. If None, no explicit broadcast is performed.
  - process_group (Optional[ProcessGroup]) – Process group used for communication. Defaults to the current default process group.
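A minimal usage sketch follows. It assumes thunder.jit accepts a plugins argument and that the default process group has already been initialized via torch.distributed (for example under torchrun); neither is documented on this page, so treat the exact entry point as an assumption rather than the library's confirmed API.

    # Minimal sketch, assuming thunder.jit(..., plugins=[...]) is the entry point
    # and that torch.distributed is initialized (e.g. launched via torchrun).
    import torch
    import torch.distributed as dist

    import thunder
    from thunder.plugins import DDP

    dist.init_process_group(backend="nccl")
    device = torch.device("cuda", dist.get_rank() % torch.cuda.device_count())

    model = torch.nn.Linear(1024, 1024).to(device)

    # 25 MB gradient buckets; broadcast initial parameters from global rank 0.
    jitted = thunder.jit(model, plugins=[DDP(bucket_size_in_mb=25.0, broadcast_from=0)])

    out = jitted(torch.randn(8, 1024, device=device))
    out.sum().backward()  # gradients are bucketed and all-reduced across ranks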
Methods

- __init__([bucket_size_in_mb, ...])
- setup_executors()
- setup_lookasides()
- setup_transforms() – Constructs the list of graph-level transforms.
Attributes

- policy