Common use cases

Cloud Training
Learn to train Lightning models on the cloud

Checkpointing
Lightning checkpoints have everything you need to save and restore your models

Cluster Training
Learn to train on your university or company's cluster

Debugging
Tricks for debugging your Lightning models

Early Stopping
Save time and money by stopping training once key metrics stop improving or a time limit has elapsed

Effective Training Techniques
Find state-of-the-art training techniques such as stochastic weight averaging (SWA) and gradient accumulation

Evaluation
Avoid overfitting (memorizing the dataset) with these techniques

Fast Baselines
Before coding a complex model, use lightning-flash to create a baseline in a few lines of code

Fault-Tolerant Training
Enable fault-tolerant training on clusters and clouds where machines might fail (e.g., preemptible machines)

Hyperparameters (via command-line)
Make your models more flexible by enabling command-line arguments

Inference in Production
Use the latest tricks to easily productionize your Lightning models

Lightning CLI
Reduce configuration boilerplate with the Lightning CLI

Loggers (experiment managers)
Visualize your machine learning experiments with these experiment managers

Model and Datamodule Registry
Use the model registry to mix and match your models and datamodules

Model Parallelism
Train models with 1 trillion+ parameters using these advanced built-in techniques

N-Bit Precision
Increase batch sizes and improve speed by training with 16-bit precision and more

Manual Optimization
Enable manual optimization to fully control the optimization procedure for advanced research

Profiling
Use these profilers to find bottlenecks in your model

Progress Bar
Use these built-in progress bars or learn how to make your own!

Pruning and Quantization
Compress models to speed up inference for deployment without loss of accuracy

Remote Filesystems
Work with data on any local or cloud filesystem, such as S3 on AWS, GCS on Google Cloud, or ADL on Azure

Strategy Registry
Building the next DeepSpeed, FSDP, or other fancy scaling technique? Add it to Lightning here

TorchMetrics
Simplify metric calculations to scale-proof your models

Transfer Learning (finetuning)
Use models trained on large datasets to achieve better results when you don't have much data
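
To make the Early Stopping entry above more concrete, here is a minimal sketch of the patience rule it describes: stop once a monitored metric has failed to improve for a set number of consecutive checks. This is plain Python for illustration only and is not Lightning's own implementation. The names `PatienceStopper`, `should_stop`, and `epochs_run` are hypothetical, and the sketch assumes the monitored metric is a validation loss where lower is better.

```python
from dataclasses import dataclass


@dataclass
class PatienceStopper:
    """Hypothetical helper (not part of Lightning) that tracks a lower-is-better metric."""

    patience: int = 3          # consecutive non-improving checks tolerated before stopping
    min_delta: float = 0.0     # minimum decrease that counts as an improvement
    best: float = float("inf")
    bad_checks: int = 0

    def should_stop(self, value: float) -> bool:
        # An improvement resets the counter; anything else counts against patience.
        if value < self.best - self.min_delta:
            self.best = value
            self.bad_checks = 0
        else:
            self.bad_checks += 1
        return self.bad_checks >= self.patience


def epochs_run(val_losses, patience=3):
    """Return how many epochs would run before the patience rule stops training."""
    stopper = PatienceStopper(patience=patience)
    for epoch, loss in enumerate(val_losses, start=1):
        if stopper.should_stop(loss):
            return epoch
    return len(val_losses)


# Made-up validation losses, purely for illustration.
losses = [1.0, 0.8, 0.7, 0.71, 0.72, 0.70, 0.6]
print(epochs_run(losses, patience=3))
```

With a larger patience, the same run would continue long enough to see the later improvement, which is the trade-off this setting controls: less wasted compute versus the risk of stopping too soon.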