API Reference

Fabric

Fabric: Fabric accelerates your PyTorch training or inference code with minimal changes required.
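A minimal usage sketch showing how Fabric wraps an ordinary PyTorch loop; the model, data, and hyperparameters are placeholders, not part of the API:

```python
import torch
from lightning.fabric import Fabric

fabric = Fabric(accelerator="auto", devices=1)
fabric.launch()

# Placeholder model and optimizer; any torch.nn.Module works here.
model = torch.nn.Linear(32, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
model, optimizer = fabric.setup(model, optimizer)

batch = torch.randn(8, 32, device=fabric.device)
target = torch.randint(0, 2, (8,), device=fabric.device)

optimizer.zero_grad()
loss = torch.nn.functional.cross_entropy(model(batch), target)
fabric.backward(loss)  # replaces loss.backward()
optimizer.step()
```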
Accelerators

Accelerator: The Accelerator base class.
CPUAccelerator: Accelerator for CPU devices.
CUDAAccelerator: Accelerator for NVIDIA CUDA devices.
MPSAccelerator: Accelerator for Metal Apple Silicon GPU devices.
TPUAccelerator: Accelerator for TPU devices.
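Accelerators are usually selected through the accelerator argument of Fabric; a short sketch (the device count is illustrative):

```python
from lightning.fabric import Fabric
from lightning.fabric.accelerators import CUDAAccelerator

# Select an accelerator by name ("cpu", "cuda", "mps", "tpu", or "auto") ...
fabric = Fabric(accelerator="cuda", devices=2)

# ... or pass an Accelerator instance explicitly.
fabric = Fabric(accelerator=CUDAAccelerator(), devices=2)
```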
Loggers

Logger: Base class for experiment loggers.
CSVLogger: Log to the local file system in CSV format.
TensorBoardLogger: Log to the local file system in TensorBoard format.
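A sketch of attaching loggers to Fabric and logging metrics; the root_dir and metric names are illustrative:

```python
from lightning.fabric import Fabric
from lightning.fabric.loggers import CSVLogger, TensorBoardLogger

loggers = [CSVLogger(root_dir="logs"), TensorBoardLogger(root_dir="logs")]
fabric = Fabric(loggers=loggers)
fabric.launch()

fabric.log("train/loss", 0.25)                   # log a single scalar
fabric.log_dict({"train/acc": 0.9, "epoch": 1})  # log several values at once
```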
Plugins

Precision

Precision: Base class for all plugins handling the precision-specific parts of the training.
DoublePrecision: Plugin for training with double (torch.float64) precision.
MixedPrecision: Plugin for Automatic Mixed Precision (AMP) training with torch.autocast.
TPUPrecision: Precision plugin for TPU integration.
TPUBf16Precision: Plugin that enables bfloat16 on TPUs.
FSDPPrecision: AMP for Fully Sharded Data Parallel (FSDP) training.
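Precision plugins are most often selected through the precision argument; the accepted string values and the MixedPrecision constructor arguments shown below depend on the Lightning version, so treat them as assumptions:

```python
from lightning.fabric import Fabric
from lightning.fabric.plugins import MixedPrecision

# Shorthand: let Fabric create the precision plugin from a string.
fabric = Fabric(accelerator="cuda", devices=1, precision="16-mixed")

# Equivalent, passing the plugin object explicitly
# (constructor arguments are assumptions and may differ between versions).
fabric = Fabric(
    accelerator="cuda",
    devices=1,
    plugins=MixedPrecision(precision="16-mixed", device="cuda"),
)
```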
Environments

ClusterEnvironment: Specification of a cluster environment.
KubeflowEnvironment: Environment for distributed training using the PyTorchJob operator from Kubeflow.
LightningEnvironment: The default environment used by Lightning for a single node or free cluster (not managed).
LSFEnvironment: An environment for running on clusters managed by the LSF resource manager.
SLURMEnvironment: Cluster environment for training on a cluster managed by SLURM.
TorchElasticEnvironment: Environment for fault-tolerant and elastic training with torchelastic.
XLAEnvironment: Cluster environment for training on a TPU Pod with the PyTorch/XLA library.
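Cluster environments are normally detected automatically from the environment variables set by the cluster manager; one can also be passed explicitly through plugins. A sketch for a SLURM cluster, where the auto_requeue argument is an assumption:

```python
from lightning.fabric import Fabric
from lightning.fabric.plugins.environments import SLURMEnvironment

# Explicitly select the SLURM environment instead of relying on auto-detection.
# auto_requeue is an assumed constructor argument; check your version's API.
fabric = Fabric(
    accelerator="cuda",
    devices=2,
    num_nodes=4,
    plugins=SLURMEnvironment(auto_requeue=False),
)
fabric.launch()
```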
IO

CheckpointIO: Interface to save/load checkpoints as they are saved through the Strategy.
TorchCheckpointIO: CheckpointIO that utilizes torch.save and torch.load to save and load checkpoints, common for most use cases.
XLACheckpointIO: CheckpointIO that utilizes xm.save to save checkpoints for TPU training strategies.
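Checkpoint IO plugins are used indirectly through fabric.save and fabric.load, which route through the active CheckpointIO plugin (TorchCheckpointIO by default). A sketch assuming the fabric.save(path, state) call signature; the path is illustrative:

```python
import torch
from lightning.fabric import Fabric

fabric = Fabric(accelerator="cpu")
model = torch.nn.Linear(4, 4)
optimizer = torch.optim.Adam(model.parameters())
model, optimizer = fabric.setup(model, optimizer)

# Fabric hands these calls to the active CheckpointIO plugin.
state = {"model": model, "optimizer": optimizer, "step": 0}
fabric.save("checkpoints/last.ckpt", state)
fabric.load("checkpoints/last.ckpt", state)  # restores the objects in place
```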
Collectives

Collective: Interface for collective operations.
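Collectives back the communication helpers that Fabric exposes directly, such as barrier, broadcast, and all_gather. A sketch assuming fabric.launch(fn) passes the Fabric object to the launched function:

```python
from lightning.fabric import Fabric


def run(fabric):
    fabric.barrier()                                        # wait for every process
    rank_zero_value = fabric.broadcast(fabric.global_rank)  # rank 0's value on all ranks
    all_ranks = fabric.all_gather(fabric.global_rank)       # gather a value from every rank
    fabric.print(rank_zero_value, all_ranks)


fabric = Fabric(accelerator="cpu", devices=2, strategy="ddp")
fabric.launch(run)
```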
Strategies

Strategy: Base class for all strategies that change the behaviour of the training, validation, and test loop.
DDPStrategy: Strategy for multi-process single-device training on one or multiple nodes.
DataParallelStrategy: Implements data-parallel training in a single process, i.e., the model gets replicated to each device and each device gets a split of the data.
FSDPStrategy: Strategy for Fully Sharded Data Parallel (FSDP) training provided by torch.distributed.
ParallelStrategy: Strategy for training with multiple processes in parallel.
SingleDeviceStrategy: Strategy that handles communication on a single device.
SingleTPUStrategy: Strategy for training on a single TPU device.
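Strategies are selected by name through the strategy argument or passed as objects for finer control; the DDPStrategy constructor argument below is an assumption forwarded to torch's DistributedDataParallel:

```python
from lightning.fabric import Fabric
from lightning.fabric.strategies import DDPStrategy

# Select a strategy by name ...
fabric = Fabric(accelerator="cuda", devices=4, strategy="ddp")

# ... or configure it explicitly (find_unused_parameters is an assumed
# keyword forwarded to DistributedDataParallel).
fabric = Fabric(
    accelerator="cuda",
    devices=4,
    strategy=DDPStrategy(find_unused_parameters=True),
)
```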