API Reference¶
Fabric¶
Fabric | Fabric accelerates your PyTorch training or inference code with minimal changes required.
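A minimal sketch of the typical Fabric loop, assuming the lightning>=2.0 import path (`from lightning.fabric import Fabric`); the toy model, data, and hyperparameters are placeholders:

```python
import torch
from lightning.fabric import Fabric

fabric = Fabric(accelerator="auto", devices=1)     # picks CPU/CUDA/MPS/TPU automatically
fabric.launch()

model = torch.nn.Linear(32, 2)                     # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
model, optimizer = fabric.setup(model, optimizer)  # wraps for the chosen device/strategy

batch = torch.randn(8, 32, device=fabric.device)
loss = model(batch).sum()
fabric.backward(loss)                              # replaces loss.backward()
optimizer.step()
optimizer.zero_grad()
```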
Accelerators¶
Accelerator | The Accelerator base class.
CPUAccelerator | Accelerator for CPU devices.
CUDAAccelerator | Accelerator for NVIDIA CUDA devices.
MPSAccelerator | Accelerator for Metal Apple Silicon GPU devices.
TPUAccelerator | Accelerator for TPU devices.
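A hedged illustration of how an accelerator is chosen, assuming the classes above are importable from `lightning.fabric.accelerators`: pass a string flag to Fabric or an accelerator instance.

```python
from lightning.fabric import Fabric
from lightning.fabric.accelerators import CUDAAccelerator, MPSAccelerator

# Selection by name; "auto" falls back through the available accelerators
fabric = Fabric(accelerator="cuda", devices=2)

# Equivalent selection with an explicit instance
fabric = Fabric(accelerator=CUDAAccelerator(), devices=2)

# Accelerators expose an availability check that can guard the choice
accelerator = "mps" if MPSAccelerator.is_available() else "cpu"
fabric = Fabric(accelerator=accelerator, devices=1)
```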
Loggers¶
Logger | Base class for experiment loggers.
CSVLogger | Log to the local file system in CSV format.
TensorBoardLogger | Log to the local file system in TensorBoard format.
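A short sketch of attaching a logger, assuming CSVLogger and TensorBoardLogger are importable from `lightning.fabric.loggers`; the metric names are arbitrary examples.

```python
from lightning.fabric import Fabric
from lightning.fabric.loggers import CSVLogger, TensorBoardLogger

logger = TensorBoardLogger(root_dir="logs")          # or CSVLogger(root_dir="logs")
fabric = Fabric(loggers=[logger])
fabric.launch()

fabric.log("train/loss", 0.42)                       # one scalar
fabric.log_dict({"train/acc": 0.91, "epoch": 1})     # several values at once
```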
Plugins¶
Precision¶
Precision | Base class for all plugins handling the precision-specific parts of training.
DoublePrecision | Plugin for training with double (torch.float64) precision.
MixedPrecision | Plugin for Automatic Mixed Precision (AMP) training with torch.autocast.
TPUPrecision | Precision plugin for TPU integration.
TPUBf16Precision | Plugin that enables bfloat16 on TPUs.
FSDPPrecision | AMP for Fully Sharded Data Parallel training.
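These plugins are normally selected indirectly through the `precision` flag; a sketch assuming the 2.x-style precision strings (older short forms such as "16" or "bf16" may apply to other versions):

```python
from lightning.fabric import Fabric

fabric = Fabric(precision="16-mixed")    # float16 AMP (MixedPrecision with grad scaling)
fabric = Fabric(precision="bf16-mixed")  # bfloat16 autocast, no grad scaling needed
fabric = Fabric(precision="64-true")     # full double precision (DoublePrecision)
fabric = Fabric(precision="32-true")     # default single precision
```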
Environments¶
ClusterEnvironment | Specification of a cluster environment.
KubeflowEnvironment | Environment for distributed training using the PyTorchJob operator from Kubeflow.
LightningEnvironment | The default environment used by Lightning for a single node or free cluster (not managed).
LSFEnvironment | An environment for running on clusters managed by the LSF resource manager.
SLURMEnvironment | Cluster environment for training on a cluster managed by SLURM.
TorchElasticEnvironment | Environment for fault-tolerant and elastic training with torchelastic.
XLAEnvironment | Cluster environment for training on a TPU Pod with the PyTorch/XLA library.
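Cluster environments are usually detected automatically from the scheduler's environment variables; a hedged sketch of overriding that detection by passing one explicitly via `plugins`, assuming the import path `lightning.fabric.plugins.environments`:

```python
from lightning.fabric import Fabric
from lightning.fabric.plugins.environments import SLURMEnvironment

# Force the SLURM environment instead of relying on auto-detection
fabric = Fabric(
    accelerator="gpu",
    devices=4,
    num_nodes=2,
    strategy="ddp",
    plugins=[SLURMEnvironment()],
)
fabric.launch()
```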
IO¶
CheckpointIO | Interface to save/load checkpoints as they are saved through the Strategy.
TorchCheckpointIO | CheckpointIO that utilizes torch.save and torch.load to save and load checkpoints.
XLACheckpointIO | CheckpointIO that utilizes xm.save to save checkpoints for TPU training strategies.
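A sketch of checkpointing through Fabric, assuming a checkpoint-IO plugin can be passed via `plugins` (TorchCheckpointIO is the usual default for non-XLA strategies); the path and toy model are placeholders:

```python
import torch
from lightning.fabric import Fabric
from lightning.fabric.plugins import TorchCheckpointIO

fabric = Fabric(plugins=[TorchCheckpointIO()])  # explicit here, though it is the usual default
fabric.launch()

model = torch.nn.Linear(4, 4)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
model, optimizer = fabric.setup(model, optimizer)

# fabric.save / fabric.load route through the configured CheckpointIO plugin
state = {"model": model, "optimizer": optimizer, "step": 0}
fabric.save("checkpoints/last.ckpt", state)     # placeholder path
fabric.load("checkpoints/last.ckpt", state)     # restores the objects in place
```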
Collectives¶
Collective | Interface for collective operations.
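The collective interface is mostly used internally by strategies; user code typically reaches it through the convenience methods on Fabric. A hedged sketch in which the tensor contents are arbitrary:

```python
import torch
from lightning.fabric import Fabric

fabric = Fabric(accelerator="cpu", devices=2, strategy="ddp")
fabric.launch()

fabric.barrier()                                   # wait for every process
seed = fabric.broadcast(1234, src=0)               # same value on all ranks afterwards
local = torch.tensor([float(fabric.global_rank)], device=fabric.device)
gathered = fabric.all_gather(local)                # stacked across processes
fabric.print(gathered)                             # printed on rank 0 only
```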
Strategies¶
Strategy | Base class for all strategies that change the behaviour of the training, validation and test loop.
DDPStrategy | Strategy for multi-process single-device training on one or multiple nodes.
DataParallelStrategy | Implements data-parallel training in a single process, i.e., the model gets replicated to each device and each gets a split of the data.
FSDPStrategy | Strategy for Fully Sharded Data Parallel provided by torch.distributed.
ParallelStrategy | Strategy for training with multiple processes in parallel.
SingleDeviceStrategy | Strategy that handles communication on a single device.
SingleTPUStrategy | Strategy for training on a single TPU device.
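As with accelerators, a strategy can be requested by name or configured as an instance; a brief sketch assuming DDPStrategy is importable from `lightning.fabric.strategies` (its keyword arguments are forwarded to DistributedDataParallel):

```python
from lightning.fabric import Fabric
from lightning.fabric.strategies import DDPStrategy

# Selection by name
fabric = Fabric(strategy="ddp", accelerator="gpu", devices=4)

# Equivalent selection with an explicitly configured instance
fabric = Fabric(
    strategy=DDPStrategy(find_unused_parameters=False),
    accelerator="gpu",
    devices=4,
)
```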