TorchElasticEnvironment
- class lightning.pytorch.plugins.environments.TorchElasticEnvironment
Bases: ClusterEnvironment
Environment for fault-tolerant and elastic training with torchelastic.
- static detect()
Returns True if the current process was launched using the torchelastic command.
- Return type: bool
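As a rough illustration of what such a detection can look like, the sketch below checks for the `TORCHELASTIC_RUN_ID` environment variable, which torchelastic launchers export to worker processes. The helper name `launched_by_torchelastic` is hypothetical, and the mechanism shown is an assumption for illustration, not necessarily how `detect()` is implemented.

```python
import os
from typing import Mapping, Optional


def launched_by_torchelastic(env: Optional[Mapping[str, str]] = None) -> bool:
    """Hypothetical helper: True if the torchelastic run-id variable is present."""
    env = os.environ if env is None else env
    # Assumption: the launcher exports TORCHELASTIC_RUN_ID to every worker process.
    return "TORCHELASTIC_RUN_ID" in env


# Simulated environments, so the sketch runs without a launcher.
print(launched_by_torchelastic({"TORCHELASTIC_RUN_ID": "demo"}))
print(launched_by_torchelastic({}))
```

Passing the environment as a mapping keeps the check easy to exercise outside a real elastic job.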
- global_rank()
The rank (index) of the currently running process across all nodes and devices.
- Return type: int
- local_rank()
The rank (index) of the currently running process inside of the current node.
- Return type: int
- node_rank()
The rank (index) of the node on which the current process runs.
- Return type: int
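The global, local, and node ranks are related. A minimal sketch of that relationship, assuming every node runs the same number of processes and global ranks are assigned contiguously node by node (the function name and the `procs_per_node` parameter are hypothetical and not part of this class):

```python
def expected_global_rank(node_rank: int, local_rank: int, procs_per_node: int) -> int:
    """Hypothetical helper: global rank under a uniform, contiguous layout."""
    if node_rank < 0:
        raise ValueError(f"node_rank must be >= 0, got {node_rank}")
    if not 0 <= local_rank < procs_per_node:
        raise ValueError(f"local_rank must be in [0, {procs_per_node}), got {local_rank}")
    # Assumption: node 0 holds ranks 0..procs_per_node-1, node 1 the next block, and so on.
    return node_rank * procs_per_node + local_rank


# Two nodes with four processes each: the second process on the second node.
print(expected_global_rank(node_rank=1, local_rank=1, procs_per_node=4))  # 5
```

With heterogeneous nodes or a non-contiguous assignment this formula does not hold, so the values reported by the environment remain the source of truth.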
- validate_settings(num_devices, num_nodes)
Validates settings configured in the script against the environment, and raises an exception if there is an inconsistency.
- Return type: None
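One kind of inconsistency such a validation might catch is a mismatch between the requested layout and the world size provided by the launcher. The sketch below is illustrative only: `check_settings` and its `world_size` parameter are hypothetical, and the exact checks performed by `validate_settings` may differ.

```python
def check_settings(num_devices: int, num_nodes: int, world_size: int) -> None:
    """Hypothetical consistency check; raises ValueError on a mismatch."""
    if num_devices < 1 or num_nodes < 1:
        raise ValueError("num_devices and num_nodes must both be >= 1")
    # Assumption: the launcher's world size should equal devices per node times nodes.
    if num_devices * num_nodes != world_size:
        raise ValueError(
            f"num_devices={num_devices} * num_nodes={num_nodes} "
            f"does not match world size {world_size}"
        )


check_settings(num_devices=4, num_nodes=2, world_size=8)  # consistent, no exception

try:
    check_settings(num_devices=4, num_nodes=2, world_size=4)
except ValueError as err:
    print(f"rejected: {err}")
```

Failing early with an explicit message is usually easier to debug than a job that hangs while waiting for processes that were never launched.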
- property creates_processes_externally: bool
Whether the environment creates the subprocesses or not.
- property main_address: str
The main address through which all processes connect and communicate.
- property main_port: int
An open and configured port in the main node through which all processes communicate.
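To show how a main address and port are typically combined, here is a small sketch that builds a TCP rendezvous URL from the conventional `MASTER_ADDR` and `MASTER_PORT` environment variables used by `torch.distributed`. The helper `rendezvous_url` is hypothetical, and whether this class reads exactly these variables is an assumption.

```python
import os
from typing import Mapping, Optional


def rendezvous_url(env: Optional[Mapping[str, str]] = None) -> str:
    """Hypothetical helper: build a tcp:// URL from the main address and port."""
    env = os.environ if env is None else env
    # Assumption: the launcher exports MASTER_ADDR and MASTER_PORT to every process.
    address = env["MASTER_ADDR"]
    port = int(env["MASTER_PORT"])
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return f"tcp://{address}:{port}"


# Simulated environment, so the sketch runs without a launcher.
print(rendezvous_url({"MASTER_ADDR": "10.0.0.1", "MASTER_PORT": "29500"}))
```

Every process must resolve the same address and port, which is why these values come from the environment rather than being chosen per process.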