ClusterEnvironment

class lightning.pytorch.plugins.environments.ClusterEnvironment

Bases: ABC

Specification of a cluster environment.

abstract static detect()

Detects the environment settings corresponding to this cluster and returns True if they match.

Return type:

bool
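As an illustration, a detection check could look for variables that a scheduler exports into the process environment. The sketch below is standalone and does not subclass the Lightning class. The variable names `MYCLUSTER_JOB_ID` and `MYCLUSTER_NTASKS` are hypothetical, and the optional `environ` argument exists only to make the function easy to test; the real `detect()` takes no arguments.

```python
import os
from typing import Mapping, Optional

# Hypothetical variables exported by an imaginary scheduler (assumption).
REQUIRED_VARS = ("MYCLUSTER_JOB_ID", "MYCLUSTER_NTASKS")


def detect(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return True only if every required variable is present."""
    env = os.environ if environ is None else environ
    return all(name in env for name in REQUIRED_VARS)
```

Requiring all variables rather than any one of them reduces the chance of a false match when an unrelated tool happens to set a similarly named variable.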

abstract global_rank()

The rank (index) of the currently running process across all nodes and devices.

Return type:

int

abstract local_rank()

The rank (index) of the currently running process within the current node.

Return type:

int

abstract node_rank()

The rank (index) of the node on which the current process runs.

Return type:

int
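The three ranks are related. The sketch below shows one common layout, assuming every node runs the same number of processes and ranks are assigned node by node. The interface itself does not mandate this layout, and the helper `global_rank_from` is a hypothetical name used only for illustration.

```python
def global_rank_from(node_rank: int, local_rank: int, devices_per_node: int) -> int:
    """Derive a global rank under a homogeneous, node-major layout (assumption)."""
    if devices_per_node <= 0:
        raise ValueError("devices_per_node must be positive")
    if not 0 <= local_rank < devices_per_node:
        raise ValueError("local_rank must be in [0, devices_per_node)")
    if node_rank < 0:
        raise ValueError("node_rank must be non-negative")
    return node_rank * devices_per_node + local_rank


# Two nodes with two processes each: global ranks enumerate node 0 first.
layout = [global_rank_from(n, l, 2) for n in range(2) for l in range(2)]
```

Under this layout, the process with local rank 2 on node 1 of a 4-device-per-node cluster has global rank 6.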

teardown()

Cleans up any state that was set, once execution finishes.

Return type:

None

validate_settings(num_devices, num_nodes)

Validates settings configured in the script against the environment, and raises an exception if there is an inconsistency.

Return type:

None
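The reference does not say which inconsistencies are checked. One plausible check, sketched here as a standalone function, compares the process count implied by the script with the count the cluster reports. The `env_world_size` parameter and the choice of `ValueError` are assumptions for illustration, not the library's behavior.

```python
def validate_settings(num_devices: int, num_nodes: int, env_world_size: int) -> None:
    """Raise if the script's settings disagree with the environment (sketch)."""
    expected = num_devices * num_nodes
    if expected != env_world_size:
        raise ValueError(
            f"Script requests {num_devices} devices x {num_nodes} nodes "
            f"= {expected} processes, but the environment reports {env_world_size}."
        )
```

Failing early with both numbers in the message makes a mismatch between the job submission and the training script easier to diagnose than a hang during process-group initialization.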

abstract world_size()

The number of processes across all devices and nodes.

Return type:

int

abstract property creates_processes_externally: bool

Whether the environment itself creates the subprocesses.
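A caller might branch on this flag to decide whether it has to start worker processes itself. The following is a minimal sketch of that decision for a hypothetical launcher; the function name and the one-process-per-device policy are assumptions and do not describe Lightning's own launchers.

```python
def processes_to_start(creates_processes_externally: bool, num_devices: int) -> int:
    """Number of worker processes a hypothetical launcher should start itself."""
    if num_devices < 0:
        raise ValueError("num_devices must be non-negative")
    # If the cluster (for example a job scheduler) already started one process
    # per device, the launcher must not start any more.
    return 0 if creates_processes_externally else num_devices
```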

abstract property main_address: str

The main address through which all processes connect and communicate.

abstract property main_port: int

An open and configured port on the main node through which all processes communicate.
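As a sketch of how the two connection properties might be resolved, the helper below reads `MASTER_ADDR` and `MASTER_PORT`, the variable names used by `torch.distributed` environment-variable initialization. The helper name `resolve_main_endpoint` and the fallback values are assumptions of this sketch, not documented defaults of this class.

```python
import os
from typing import Mapping, Optional, Tuple


def resolve_main_endpoint(
    environ: Optional[Mapping[str, str]] = None,
    default_address: str = "127.0.0.1",  # assumed fallback for single-node runs
    default_port: int = 29500,           # assumed fallback port
) -> Tuple[str, int]:
    """Return (main_address, main_port) read from the environment."""
    env = os.environ if environ is None else environ
    address = env.get("MASTER_ADDR", default_address)
    port = int(env.get("MASTER_PORT", default_port))
    if not 0 < port < 65536:
        raise ValueError(f"main port out of range: {port}")
    return address, port
```

Returning the port as an `int` matches the declared type of `main_port`, so the conversion from the string-valued environment happens in one place.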