Common use cases

Learn to train Lightning models on the cloud

Lightning checkpoints have everything you need to save and restore your models

Learn to train on your university or company's cluster

Tricks for debugging your Lightning models

Save time and money by training until key metrics stop improving or time has elapsed

Find the latest SOTA training techniques, such as stochastic weight averaging (SWA) and gradient accumulation

Avoid over-fitting (memorizing the dataset) with these techniques

Before coding a complex model, use lightning-flash to create a baseline in a few lines of code

Enable fault-tolerant training in clusters/clouds where machines might fail (e.g., preemptible machines)

Make your models more flexible by enabling command-line arguments

Use the latest tricks to easily productionize your Lightning models

Reduce configuration boilerplate with the Lightning CLI

Visualize your machine learning experiments with these experiment managers

Use the model registry to mix and match your models and DataModules

Train models with 1 trillion+ parameters using these advanced built-in techniques

Increase batch sizes and speed up training with 16-bit precision and other techniques

Enable manual optimization to fully control the optimization procedure for advanced research

Use these profilers to find bottlenecks in your model

Use these built-in progress bars or learn how to make your own!

Compress models to speed up inference for deployment without significant loss of accuracy

Work with data on any local or cloud filesystem such as S3 on AWS, GCS on Google Cloud, or ADL on Azure

Building the next DeepSpeed, FSDP, or other fancy scaling technique? Add it to Lightning here

Simplify metrics calculations to scale-proof your models

Use models trained on large datasets to achieve better results when you don't have much data