KubeflowEnvironment
- class lightning.pytorch.plugins.environments.KubeflowEnvironment
Bases: ClusterEnvironment
Environment for distributed training using the PyTorchJob operator from Kubeflow.
Unlike the other environments, this one is not auto-detected and must be passed to the Fabric/Trainer constructor manually.
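Because the environment is not auto-detected, it has to be handed over explicitly. The following is a minimal sketch, assuming a Lightning 2.x installation and a training script that runs inside a pod started by a PyTorchJob. The helper `build_trainer` and its parameters are hypothetical names used only for illustration.

```python
def build_trainer(num_nodes: int, devices: int = 1):
    """Return a Trainer that uses KubeflowEnvironment explicitly (illustrative helper)."""
    # Imports are deferred so the helper can be defined without Lightning installed.
    from lightning.pytorch import Trainer
    from lightning.pytorch.plugins.environments import KubeflowEnvironment

    # Cluster environments are passed through the `plugins` argument.
    return Trainer(
        accelerator="auto",
        devices=devices,
        num_nodes=num_nodes,
        strategy="ddp",
        plugins=[KubeflowEnvironment()],
    )
```

The helper is meant to be called from the script that the PyTorchJob launches; outside such a job the environment is unlikely to find the settings it expects. The Fabric constructor takes a `plugins` argument as well.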
- static detect()
Detects the environment settings corresponding to this cluster and returns True if they match.
- Return type: bool
- global_rank()
The rank (index) of the currently running process across all nodes and devices.
- Return type: int
- local_rank()
The rank (index) of the currently running process within the current node.
- Return type: int
- node_rank()
The rank (index) of the node on which the current process runs.
- Return type: int
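The three ranks above are related. As a rough illustration, assuming the common layout in which every node runs the same number of processes, the global rank can be derived from the node rank and the local rank. This is a sketch of that convention, not a statement of how KubeflowEnvironment computes its values; `global_rank_from` and `procs_per_node` are hypothetical names.

```python
def global_rank_from(node_rank: int, local_rank: int, procs_per_node: int) -> int:
    """Map (node_rank, local_rank) to a global rank for a homogeneous cluster."""
    if not 0 <= local_rank < procs_per_node:
        raise ValueError("local_rank must be in [0, procs_per_node)")
    return node_rank * procs_per_node + local_rank


# Two nodes with four processes each: node 0 holds ranks 0-3, node 1 holds ranks 4-7.
layout = [
    (node, local, global_rank_from(node, local, 4))
    for node in range(2)
    for local in range(4)
]
```

When each node runs exactly one process, the local rank is always 0 and the global rank coincides with the node rank.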
- property creates_processes_externally: bool
Whether the environment creates the subprocesses or not.
- property main_address: str
The main address through which all processes connect and communicate.
- property main_port: int
An open and configured port in the main node through which all processes communicate.
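In launcher-managed jobs, the address, port, and rank are typically supplied to each process through environment variables. The following stdlib-only sketch assumes the names conventionally used by `torch.distributed` (`MASTER_ADDR`, `MASTER_PORT`, `RANK`, `WORLD_SIZE`). The `RendezvousInfo` class and `read_rendezvous` function are hypothetical and are not Lightning's implementation.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class RendezvousInfo:
    main_address: str
    main_port: int
    global_rank: int
    world_size: int


def read_rendezvous(env: Mapping[str, str] = os.environ) -> RendezvousInfo:
    """Parse connection settings from a mapping of environment variables."""
    required = ("MASTER_ADDR", "MASTER_PORT", "RANK", "WORLD_SIZE")
    missing = [name for name in required if name not in env]
    if missing:
        raise KeyError(f"missing environment variables: {', '.join(missing)}")
    return RendezvousInfo(
        main_address=env["MASTER_ADDR"],
        main_port=int(env["MASTER_PORT"]),  # ports arrive as strings
        global_rank=int(env["RANK"]),
        world_size=int(env["WORLD_SIZE"]),
    )


# Example with made-up values, passed explicitly instead of reading os.environ.
example = read_rendezvous(
    {"MASTER_ADDR": "10.0.0.1", "MASTER_PORT": "23456", "RANK": "1", "WORLD_SIZE": "4"}
)
```

Passing the mapping as a parameter keeps the parsing logic testable without touching the real process environment.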