API References

Accelerator API

`Accelerator`	The Accelerator Base Class.
`CPUAccelerator`	Accelerator for CPU devices.
`GPUAccelerator`	Accelerator for GPU devices.
`HPUAccelerator`	Accelerator for HPU devices.
`IPUAccelerator`	Accelerator for IPUs.
`TPUAccelerator`	Accelerator for TPU devices.

Core API

`CheckpointHooks`	Hooks to be used with Checkpointing.
`DataHooks`	Hooks to be used for data-related logic.
`ModelHooks`	Hooks to be used in LightningModule.
`LightningDataModule`	A DataModule standardizes the training, validation, and test splits, data preparation, and transforms.
`LightningModule`
`DeviceDtypeModuleMixin`	Initializes internal Module state, shared by both nn.Module and ScriptModule.
`HyperparametersMixin`
`LightningOptimizer`	Wraps the user's optimizers and properly handles the backward and `optimizer_step` logic across accelerators, AMP, and `accumulate_grad_batches`.
`ModelIO`

Strategy API

`BaguaStrategy`	Strategy for training using the Bagua library, with advanced distributed training algorithms and system optimizations.
`DDP2Strategy`	DDP2 behaves like DP in one node, but synchronization across nodes behaves like in DDP.
`DDPFullyShardedStrategy`	Strategy for Fully Sharded Data Parallel training provided by FairScale.
`DDPShardedStrategy`	Optimizer and gradient sharded training provided by FairScale.
`DDPSpawnShardedStrategy`	Optimizer sharded training provided by FairScale.
`DDPSpawnStrategy`	Spawns processes using the `torch.multiprocessing.spawn()` method and joins processes after training finishes.
`DDPStrategy`	Strategy for multi-process single-device training on one or multiple nodes.
`DataParallelStrategy`	Implements data-parallel training in a single process, i.e., the model gets replicated to each device and each gets a split of the data.
`DeepSpeedStrategy`	Provides capabilities to run training using the DeepSpeed library, with training optimizations for large billion parameter models.
`HorovodStrategy`	Strategy for Horovod distributed training integration.
`HPUParallelStrategy`	Strategy for distributed training on multiple HPU devices.
`IPUStrategy`	Strategy for training on IPU devices.
`ParallelStrategy`	Strategy for training with multiple processes in parallel.
`SingleDeviceStrategy`	Strategy that handles communication on a single device.
`SingleHPUStrategy`	Strategy for training on single HPU device.
`SingleTPUStrategy`	Strategy for training on a single TPU device.
`Strategy`	Base class for all strategies that change the behaviour of the training, validation, and test loops.
`TPUSpawnStrategy`	Strategy for training multiple TPU devices using the `torch_xla.distributed.xla_multiprocessing.spawn()` method.
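
The data-parallel idea behind several of these strategies (replicate the model, hand each replica a split of the batch, then combine the results) can be illustrated without any framework. The following is a minimal pure-Python sketch under stated assumptions, not Lightning's implementation: `shard`, `grad_mse`, and `data_parallel_grad` are hypothetical helper names, the "model" is a single weight `w`, and the batch is assumed to divide evenly across replicas.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[float, float]  # (input x, target y)


def shard(batch: Sequence[Sample], num_replicas: int) -> List[List[Sample]]:
    """Split a batch into equally sized contiguous shards, one per replica."""
    if len(batch) % num_replicas != 0:
        raise ValueError("this sketch assumes the batch divides evenly")
    size = len(batch) // num_replicas
    return [list(batch[i * size:(i + 1) * size]) for i in range(num_replicas)]


def grad_mse(w: float, samples: Sequence[Sample]) -> float:
    """Gradient of mean((w * x - y) ** 2) with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in samples) / len(samples)


def data_parallel_grad(w: float, batch: Sequence[Sample], num_replicas: int) -> float:
    """Each replica computes a gradient on its shard; the results are averaged."""
    local_grads = [grad_mse(w, s) for s in shard(batch, num_replicas)]
    return sum(local_grads) / num_replicas


batch = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2)]
full = grad_mse(0.5, batch)                                # single-device gradient
parallel = data_parallel_grad(0.5, batch, num_replicas=2)  # averaged over 2 replicas
```

With equally sized shards, the averaged per-replica gradient matches the full-batch gradient up to floating-point rounding, which is why splitting the data this way does not change the optimization problem.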

Callbacks API

`BackboneFinetuning`	Fine-tune a backbone model based on a user-defined learning rate schedule.
`BaseFinetuning`	This class implements the base logic for writing your own Finetuning Callback.
`BasePredictionWriter`	Base class to implement how the predictions should be stored.
`Callback`	Abstract base class used to build new callbacks.
`DeviceStatsMonitor`	Automatically monitors and logs device stats during the training stage.
`EarlyStopping`	Monitor a metric and stop training when it stops improving.
`GPUStatsMonitor`	Deprecated since version v1.5.
`GradientAccumulationScheduler`	Change the gradient accumulation factor according to a schedule.
`LambdaCallback`	Create a simple callback on the fly using lambda functions.
`LearningRateMonitor`	Automatically monitors and logs the learning rate for learning rate schedulers during training.
`ModelCheckpoint`	Save the model periodically by monitoring a quantity.
`ModelPruning`	Model pruning Callback, using PyTorch's prune utilities.
`ModelSummary`	Generates a summary of all layers in a `LightningModule`.
`ProgressBarBase`	The base class for progress bars in Lightning.
`QuantizationAwareTraining`	Quantization allows speeding up inference and decreasing memory requirements by performing computations and storing tensors at lower bitwidths (such as INT8 or FLOAT16) than floating point precision.
`RichModelSummary`	Generates a summary of all layers in a `LightningModule` with rich text formatting.
`RichProgressBar`	Create a progress bar with rich text formatting.
`StochasticWeightAveraging`	Implements the Stochastic Weight Averaging (SWA) Callback to average a model.
`Timer`	The Timer callback tracks the time spent in the training, validation, and test loops and interrupts the Trainer if the given time limit for the training loop is reached.
`TQDMProgressBar`	This is the default progress bar used by Lightning.
`XLAStatsMonitor`	Deprecated since version v1.5.
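
To make the `EarlyStopping` description ("monitor a metric and stop training when it stops improving") concrete, here is a minimal framework-free sketch of the patience logic. It is an illustration under assumptions, not the library's code: `EarlyStoppingSketch` and its `update` method are hypothetical, and the parameter names `patience`, `min_delta`, and `mode` are chosen to resemble the real callback's arguments while the behaviour is deliberately simplified.

```python
from typing import Optional


class EarlyStoppingSketch:
    """Stop when the monitored value has not improved for `patience` checks."""

    def __init__(self, patience: int = 3, min_delta: float = 0.0, mode: str = "min") -> None:
        if mode not in ("min", "max"):
            raise ValueError("mode must be 'min' or 'max'")
        self.patience = patience
        self.min_delta = min_delta
        self.mode = mode
        self.best: Optional[float] = None
        self.wait_count = 0

    def _improved(self, value: float) -> bool:
        if self.best is None:
            return True  # the first observation always counts as an improvement
        if self.mode == "min":
            return value < self.best - self.min_delta
        return value > self.best + self.min_delta

    def update(self, value: float) -> bool:
        """Record a new metric value; return True if training should stop."""
        if self._improved(value):
            self.best = value
            self.wait_count = 0
            return False
        self.wait_count += 1
        return self.wait_count >= self.patience


stopper = EarlyStoppingSketch(patience=2, mode="min")
val_losses = [0.9, 0.7, 0.71, 0.72, 0.5]
# Index of the first check at which the sketch asks to stop (None if it never does).
stopped_at = next((i for i, v in enumerate(val_losses) if stopper.update(v)), None)
```

In this run the loss stops improving after the second value, so the sketch requests a stop at index 3 and never sees the later, better value. That trade-off is what the patience setting controls.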

Loggers API

`base`	Abstract base class used to build new loggers.
`comet`	Comet Logger
`csv_logs`	CSV logger
`mlflow`	MLflow Logger
`neptune`	Neptune Logger
`tensorboard`	TensorBoard Logger
`test_tube`	Test Tube Logger
`wandb`	Weights and Biases Logger

Loop API

Base Classes

`DataLoaderLoop`	Base class to loop over all dataloaders.
`Loop`	Basic Loops interface.
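
The "basic loops interface" can be pictured as a reset step followed by repeated single steps until a completion flag is set. The sketch below is a simplified stand-in written for illustration. The method names `done`, `reset`, `advance`, `on_run_end`, and `run` follow my understanding of the Lightning `Loop` class, which should be treated as an assumption; the real class has additional hooks and is connected to a `Trainer`. `LoopSketch` and `SumLoop` are hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Any, List


class LoopSketch(ABC):
    """Simplified stand-in for a loop: reset, then advance until done."""

    @property
    @abstractmethod
    def done(self) -> bool:
        """Whether the loop has finished."""

    @abstractmethod
    def reset(self) -> None:
        """Restore the loop's internal state before a run."""

    @abstractmethod
    def advance(self) -> None:
        """Perform a single step of work."""

    def on_run_end(self) -> Any:
        """Produce the value returned by `run`; returns None by default."""
        return None

    def run(self) -> Any:
        self.reset()
        while not self.done:
            self.advance()
        return self.on_run_end()


class SumLoop(LoopSketch):
    """Toy loop that adds up a list, one item per `advance` call."""

    def __init__(self, items: List[float]) -> None:
        self.items = items
        self.index = 0
        self.total = 0.0

    @property
    def done(self) -> bool:
        return self.index >= len(self.items)

    def reset(self) -> None:
        self.index = 0
        self.total = 0.0

    def advance(self) -> None:
        self.total += self.items[self.index]
        self.index += 1

    def on_run_end(self) -> float:
        return self.total


result = SumLoop([1.0, 2.0, 3.5]).run()
```

Keeping the control flow in `run` and the work in `advance` is what allows loops to be nested or swapped out: a subclass only has to say what one step does and when to stop.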

Default Loop Implementations

Training

`TrainingBatchLoop`	Runs over a single batch of data.
`TrainingEpochLoop`	Runs over all batches in a dataloader (one epoch).
`FitLoop`	This Loop iterates over the epochs to run the training.
`ManualOptimization`	A special loop implementing what is known in Lightning as Manual Optimization, where the optimization happens entirely in `training_step()`. The user is therefore responsible for back-propagating gradients and making calls to the optimizers.
`OptimizerLoop`	Runs over a sequence of optimizers.

Validation and Testing

`EvaluationEpochLoop`	This is the loop performing the evaluation.
`EvaluationLoop`	Loops over all dataloaders for evaluation.

Prediction

`PredictionEpochLoop`	Loop performing prediction on arbitrary dataloaders that are used sequentially.
`PredictionLoop`	Loop to run over dataloaders for prediction.

Plugins API

Precision Plugins

`ApexMixedPrecisionPlugin`	Mixed Precision Plugin based on Nvidia/Apex (https://github.com/NVIDIA/apex)
`DeepSpeedPrecisionPlugin`	Precision plugin for DeepSpeed integration.
`DoublePrecisionPlugin`	Plugin for training with double (`torch.float64`) precision.
`FullyShardedNativeMixedPrecisionPlugin`	Native AMP for Fully Sharded Training.
`HPUPrecisionPlugin`	Plugin that enables bfloat/half support on HPUs.
`IPUPrecisionPlugin`	Precision plugin for IPU integration.
`MixedPrecisionPlugin`	Base Class for mixed precision.
`NativeMixedPrecisionPlugin`	Plugin for Native Mixed Precision (AMP) training with `torch.autocast`.
`PrecisionPlugin`	Base class for all plugins handling the precision-specific parts of the training.
`ShardedNativeMixedPrecisionPlugin`	Native AMP for Sharded Training.
`TPUBf16PrecisionPlugin`	Plugin that enables bfloats on TPUs.
`TPUPrecisionPlugin`	Precision plugin for TPU integration.

Cluster Environments

`ClusterEnvironment`	Specification of a cluster environment.
`KubeflowEnvironment`	Environment for distributed training using the PyTorchJob operator from Kubeflow
`LightningEnvironment`	The default environment used by Lightning for a single node or an unmanaged (free) cluster.
`LSFEnvironment`	An environment for running on clusters managed by the LSF resource manager.
`SLURMEnvironment`	Cluster environment for training on a cluster managed by SLURM.
`TorchElasticEnvironment`	Environment for fault-tolerant and elastic training with torchelastic
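
A cluster environment mainly has to answer questions such as "how many processes are there?" and "which one am I?". The following sketch shows one way to read that layout from SLURM's standard environment variables (`SLURM_NTASKS`, `SLURM_PROCID`, `SLURM_LOCALID`, `SLURM_NODEID`). It is an illustration only and does not reproduce `SLURMEnvironment`: `ClusterInfoSketch` and `from_slurm` are hypothetical names, and the single-process defaults are an assumption of this example.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class ClusterInfoSketch:
    """Process layout details a cluster environment typically has to supply."""

    world_size: int   # total number of processes in the job
    global_rank: int  # index of this process across all nodes
    local_rank: int   # index of this process on its own node
    node_rank: int    # index of the node this process runs on


def from_slurm(env: Mapping[str, str]) -> ClusterInfoSketch:
    """Read the layout from SLURM-style variables, with single-process defaults."""
    return ClusterInfoSketch(
        world_size=int(env.get("SLURM_NTASKS", "1")),
        global_rank=int(env.get("SLURM_PROCID", "0")),
        local_rank=int(env.get("SLURM_LOCALID", "0")),
        node_rank=int(env.get("SLURM_NODEID", "0")),
    )


# Simulated environment for the third task of a 2-node, 4-task job.
fake_env = {"SLURM_NTASKS": "4", "SLURM_PROCID": "2", "SLURM_LOCALID": "0", "SLURM_NODEID": "1"}
info = from_slurm(fake_env)
```

Inside a real job, passing `os.environ` instead of `fake_env` would read the values set by the scheduler.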

Checkpoint IO Plugins

`CheckpointIO`	Interface to save/load checkpoints as they are saved through the `Strategy`.
`HPUCheckpointIO`	CheckpointIO to save checkpoints for HPU training strategies.
`TorchCheckpointIO`	CheckpointIO that utilizes `torch.save()` and `torch.load()` to save and load checkpoints respectively, common for most use cases.
`XLACheckpointIO`	CheckpointIO that utilizes `xm.save()` to save checkpoints for TPU training strategies.
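
The checkpoint IO plugins all share one idea: saving, loading, and removing checkpoints sit behind a small interface so that the storage backend can be swapped. The sketch below illustrates that pattern with the standard library only. `CheckpointIOSketch` and `PickleCheckpointIO` are hypothetical classes, the three method names are assumed to resemble the real `CheckpointIO` interface, and `pickle` stands in for `torch.save()` and `torch.load()`.

```python
import os
import pickle
import tempfile
from abc import ABC, abstractmethod
from typing import Any, Dict


class CheckpointIOSketch(ABC):
    """Hypothetical minimal interface: save, load, and remove a checkpoint."""

    @abstractmethod
    def save_checkpoint(self, checkpoint: Dict[str, Any], path: str) -> None: ...

    @abstractmethod
    def load_checkpoint(self, path: str) -> Dict[str, Any]: ...

    @abstractmethod
    def remove_checkpoint(self, path: str) -> None: ...


class PickleCheckpointIO(CheckpointIOSketch):
    """Stores checkpoints with pickle as a stand-in for a tensor-aware backend."""

    def save_checkpoint(self, checkpoint: Dict[str, Any], path: str) -> None:
        # Create the parent directory first so nested paths work.
        os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
        with open(path, "wb") as f:
            pickle.dump(checkpoint, f)

    def load_checkpoint(self, path: str) -> Dict[str, Any]:
        with open(path, "rb") as f:
            return pickle.load(f)

    def remove_checkpoint(self, path: str) -> None:
        if os.path.exists(path):
            os.remove(path)


io_plugin = PickleCheckpointIO()
with tempfile.TemporaryDirectory() as tmp:
    ckpt_path = os.path.join(tmp, "run", "epoch=0.ckpt")
    io_plugin.save_checkpoint({"epoch": 0, "state_dict": {"w": [0.1, 0.2]}}, ckpt_path)
    restored = io_plugin.load_checkpoint(ckpt_path)
    io_plugin.remove_checkpoint(ckpt_path)
    still_exists = os.path.exists(ckpt_path)
```

Only load pickle files from sources you trust, since unpickling can execute arbitrary code.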

Other Plugins

`LayerSync`	Abstract base class for creating plugins that wrap layers of a model with synchronization logic for multiprocessing.
`NativeSyncBatchNorm`	A plugin that wraps all batch normalization layers of a model with synchronization logic for multiprocessing.

Profiler API

`AdvancedProfiler`	This profiler uses Python's `cProfile` to record more detailed information about time spent in each function call recorded during a given action.
`PassThroughProfiler`	This class should be used when you don't want the (small) overhead of profiling.
`Profiler`	If you wish to write a custom profiler, you should inherit from this class.
`PyTorchProfiler`	This profiler uses PyTorch's Autograd Profiler and lets you inspect the cost of different operators inside your model, both on the CPU and GPU.
`SimpleProfiler`	This profiler simply records the duration of actions (in seconds) and reports the mean duration of each action and the total time spent over the entire training run.
`XLAProfiler`	XLA Profiler will help you debug and optimize training workload performance for your models using Cloud TPU performance tools.
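
The behaviour described for `SimpleProfiler` (record the duration of each action, then report the mean and total) can be approximated in a few lines. This is a sketch for illustration rather than the library's implementation: `SimpleProfilerSketch`, its `profile` context manager, and the `summary` format are assumptions of this example.

```python
import time
from collections import defaultdict
from contextlib import contextmanager
from typing import Dict, Iterator, List


class SimpleProfilerSketch:
    """Records wall-clock durations (in seconds) per named action."""

    def __init__(self) -> None:
        self.durations: Dict[str, List[float]] = defaultdict(list)

    @contextmanager
    def profile(self, action_name: str) -> Iterator[None]:
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the duration even if the profiled block raises.
            self.durations[action_name].append(time.perf_counter() - start)

    def summary(self) -> Dict[str, Dict[str, float]]:
        """Return call count, mean duration, and total duration for each action."""
        return {
            name: {"calls": len(d), "mean": sum(d) / len(d), "total": sum(d)}
            for name, d in self.durations.items()
        }


profiler = SimpleProfilerSketch()
for _ in range(3):
    with profiler.profile("training_step"):
        sum(i * i for i in range(1000))  # stand-in for real work
report = profiler.summary()
```

The absolute timings depend on the machine, so only the call count is predictable here; the mean and total are useful for comparing actions against each other within one run.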

Trainer API

`Trainer`	Customize every aspect of training via flags.

LightningLite API

`LightningLite`	Lite accelerates your PyTorch training or inference code with minimal changes required.

Tuner API

`Tuner`	Tuner class to tune your model.

Utilities API

`apply_func`	Utilities used for collections.
`argparse`	Utilities for Argument Parsing within Lightning Components.
`cli`	Utilities for LightningCLI.
`cloud_io`	Utilities related to data saving/loading.
`deepspeed`	Utilities that can be used with DeepSpeed.
`distributed`	Utilities that can be used with distributed training.
`finite_checks`	Helper functions to detect NaN/Inf values.
`memory`	Utilities related to memory.
`model_summary`	Utilities related to model weights summary.
`optimizer`
`parsing`	Utilities used for parameter parsing.
`rank_zero`	Utilities that can be used for calling functions on a particular rank.
`seed`	Utilities to help with reproducibility of models.
`warnings`	Warning-related utilities.
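
As an example of what "utilities used for collections" can mean in practice, the sketch below applies a function to every element of a given type inside a nested structure, for instance to transform all numeric leaves of a batch. It is a simplified illustration, not the code in `apply_func`: `apply_to_collection_sketch` is a hypothetical name, and it handles only dicts, lists, and plain tuples (named tuples and other containers are returned unchanged or unsupported).

```python
from typing import Any, Callable


def apply_to_collection_sketch(data: Any, dtype: type, function: Callable[[Any], Any]) -> Any:
    """Apply `function` to every element of type `dtype` in a nested collection."""
    if isinstance(data, dtype):
        return function(data)
    if isinstance(data, dict):
        return {k: apply_to_collection_sketch(v, dtype, function) for k, v in data.items()}
    if isinstance(data, (list, tuple)):
        # Rebuild the same container type from the transformed elements.
        return type(data)(apply_to_collection_sketch(v, dtype, function) for v in data)
    return data  # leave anything else untouched


batch = {"ids": [1, 2, 3], "meta": ("train", 0.5), "scale": 2}
doubled = apply_to_collection_sketch(batch, int, lambda x: x * 2)
```

Here only the integers are doubled; the string and the float inside `meta` pass through unchanged, and the original `batch` is not modified.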