{"id":5649104,"date":"2023-10-19T05:09:34","date_gmt":"2023-10-19T09:09:34","guid":{"rendered":"https:\/\/lightning.ai\/pages\/?p=5649104"},"modified":"2023-10-23T07:55:28","modified_gmt":"2023-10-23T11:55:28","slug":"step-by-step-walk-through-of-pytorch-lightning","status":"publish","type":"post","link":"https:\/\/lightning.ai\/pages\/community\/tutorial\/step-by-step-walk-through-of-pytorch-lightning\/","title":{"rendered":"Step-By-Step Walk-Through of Pytorch Lightning"},"content":{"rendered":"<div class=\"takeaways card-glow p-4 my-4\"><h3 class=\"w-100 d-block\">Takeaways<\/h3>Learn step-by-step how to train a Convolutional Neural Network for Image Classification on CIFAR-10 dataset using PyTorch Lightning with callbacks and loggers for monitoring model performance.<\/div>\n<p>In this blog, you will learn about the different components of PyTorch Lightning and how to train an image classifier on the CIFAR-10 dataset with PyTorch Lightning. We will also discuss how to use loggers and callbacks like Tensorboard, ModelCheckpoint, etc.<\/p>\n<p>PyTorch Lightning is a high-level wrapper over PyTorch that makes model training easier and more scalable by removing boilerplate code, so that you can focus on experiments and research rather than on engineering the training process. PyTorch Lightning is a great way to get started with deep learning for beginners, as well as for experts who want to scale their training to billion+ parameter models like Llama and Stable Diffusion.<\/p>\n<p>We will begin by getting to know the key components of PyTorch Lightning and then use them to train an image classification model. Additionally, we will document our experiments using a logger such as Tensorboard to monitor and visualize the metrics. 
You can access the code used for this blog <a href=\"https:\/\/github.com\/aniketmaurya\/deep-learning-examples\/blob\/main\/src\/pytorch_lightning\/image-classification\/cifar10.ipynb\">here<\/a>.<\/p>\n<h2>Components of PyTorch Lightning<\/h2>\n<p><video width=\"1920\" height=\"920\" style=\"max-width:100%;height:auto;width:100%\" autoplay muted loop><source src=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/10\/pl-walk.mp4\" type=\"video\/mp4\"><\/video><\/p>\n<p>PyTorch Lightning consists of two primary components: <code><strong>LightningModule<\/strong><\/code>, and <code><strong>Trainer<\/strong><\/code>. These modules play a crucial role in organizing and automating various aspects and phases of the model training lifecycle. Let&#8217;s delve into each of them step by step. \u26a1<\/p>\n<h3>LightningModule &#8211; Organizes the Training Loop<\/h3>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-5649123\" src=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/10\/pl-walk-lit-module.png\" alt=\"\" width=\"2560\" height=\"1178\" srcset=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/10\/pl-walk-lit-module.png 2560w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/10\/pl-walk-lit-module-300x138.png 300w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/10\/pl-walk-lit-module-1024x471.png 1024w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/10\/pl-walk-lit-module-1536x707.png 1536w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/10\/pl-walk-lit-module-2048x942.png 2048w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/10\/pl-walk-lit-module-300x138@2x.png 600w\" sizes=\"(max-width: 2560px) 100vw, 2560px\" \/><\/p>\n<p><code><a href=\"https:\/\/lightning.ai\/docs\/pytorch\/stable\/common\/lightning_module.html\"><strong>LightningModule<\/strong><\/a><\/code> contains all the logic 
for model initialization, training\/validation steps, and the calculation of loss and accuracy metrics. It organizes the PyTorch code into six sections:<\/p>\n<p><strong>The LightningModule comprises<\/strong><\/p>\n<ul>\n<li>Initialization (<code>__init__<\/code>\u00a0and\u00a0<code>setup()<\/code>)<\/li>\n<li>Train logic (<code><a href=\"https:\/\/lightning.ai\/docs\/pytorch\/stable\/api\/lightning.pytorch.core.LightningModule.html#lightning.pytorch.core.LightningModule.training_step\">training_step()<\/a><\/code>)<\/li>\n<li>Validation loop (<code><a href=\"https:\/\/lightning.ai\/docs\/pytorch\/stable\/api\/lightning.pytorch.core.LightningModule.html#lightning.pytorch.core.LightningModule.validation_step\">validation_step()<\/a><\/code>)<\/li>\n<li>Test logic (<code><a href=\"https:\/\/lightning.ai\/docs\/pytorch\/stable\/api\/lightning.pytorch.core.LightningModule.html#lightning.pytorch.core.LightningModule.test_step\">test_step()<\/a><\/code>)<\/li>\n<li>Prediction logic (<code><a href=\"https:\/\/lightning.ai\/docs\/pytorch\/stable\/api\/lightning.pytorch.core.LightningModule.html#lightning.pytorch.core.LightningModule.predict_step\">predict_step()<\/a><\/code>)<\/li>\n<li>Optimizers and LR Schedulers (<code><a href=\"https:\/\/lightning.ai\/docs\/pytorch\/stable\/api\/lightning.pytorch.core.LightningModule.html#lightning.pytorch.core.LightningModule.configure_optimizers\">configure_optimizers()<\/a><\/code>)<\/li>\n<\/ul>\n<p>The example below shows a sample implementation of the <code>LightningModule<\/code>.<\/p>\n<pre class=\"hljs collapse-false language-python\"><code class=\"language-python\">import torch\r\nimport torch.nn as nn\r\nimport pytorch_lightning as pl\r\n\r\nclass MyLitModel(pl.LightningModule):\r\n    def __init__(self):\r\n        super().__init__()\r\n        self.model = load_model(...)  # placeholder for any nn.Module\r\n        self.loss_fn = nn.CrossEntropyLoss()\r\n\r\n    def forward(self, x):\r\n        return self.model(x)\r\n    \r\n    def training_step(self, batch, 
batch_idx):\r\n        x, y = batch\r\n        logits = self(x)\r\n        loss = self.loss_fn(logits, y)\r\n        self.log('train_loss', loss)\r\n        return loss\r\n\r\n    def configure_optimizers(self):\r\n        return torch.optim.Adam(self.parameters(), lr=0.001)\r\n<\/code><\/pre>\n<p>In this example, the model is initialized in the <strong><code>__init__<\/code><\/strong> method, and we define the <strong><code>training_step<\/code><\/strong>, which takes the <strong><code>batch<\/code><\/strong> and <strong><code>batch_idx<\/code><\/strong> arguments. We separate the inputs <strong><code>x<\/code><\/strong> and labels <strong><code>y<\/code><\/strong> from the batch, pass the inputs through the model, and calculate the cross-entropy loss. PyTorch Lightning automatically calls <strong><code>loss.backward()<\/code><\/strong> and steps the Adam optimizer that we defined in the <strong><code>configure_optimizers<\/code><\/strong> method to update the model weights. You don\u2019t need to manually move the tensors from the CPU to the GPU; Lightning \u26a1 will take care of that for you.<\/p>\n<h2>Lightning Trainer &#8211; Automating the Training Process<\/h2>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-5649107\" src=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/10\/pl-walkthrough-trainer.png\" alt=\"\" width=\"2000\" height=\"959\" srcset=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/10\/pl-walkthrough-trainer.png 2000w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/10\/pl-walkthrough-trainer-300x144.png 300w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/10\/pl-walkthrough-trainer-1024x491.png 1024w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/10\/pl-walkthrough-trainer-1536x737.png 1536w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/10\/pl-walkthrough-trainer-300x144@2x.png 600w\" 
sizes=\"(max-width: 2000px) 100vw, 2000px\" \/><\/p>\n<p>Once we have organized our training code with the <code>LightningModule<\/code> and loaded the dataset, we are all set to begin the training process using the Lightning Trainer.<\/p>\n<p>It simplifies mixed precision training, GPU selection, setting the number of devices, distributed training, and much more. The <code><a href=\"https:\/\/lightning.ai\/docs\/pytorch\/stable\/common\/trainer.html#trainer-flags\">Trainer<\/a><\/code><a href=\"https:\/\/lightning.ai\/docs\/pytorch\/stable\/common\/trainer.html#trainer-flags\"> class<\/a>\u00a0has 35 flags (at the time of writing this blog) that can be used for various tasks, ranging from defining the number of epochs to scaling the training for <a href=\"https:\/\/lightning.ai\/docs\/fabric\/latest\/advanced\/model_parallel\/fsdp.html\">models with billions of parameters<\/a>.<\/p>\n<p>&nbsp;<\/p>\n<div class=\"perfect-pullquote vcard pullquote-align-full pullquote-border-placement-left\"><blockquote><p>\ud83d\udca1 PyTorch Lightning also offers <code><strong>LightningDataModule<\/strong><\/code> that can be used to organize the PyTorch <a href=\"https:\/\/pytorch.org\/tutorials\/beginner\/basics\/data_tutorial.html\">dataset and dataloaders<\/a>. It also automates the data loading in a distributed training environment. In this blog, we won\u2019t discuss datamodules as you can also use the <code><a href=\"https:\/\/pytorch.org\/docs\/stable\/data.html\">DataLoader<\/a><\/code> directly but I would encourage the readers to read from the official docs <a href=\"https:\/\/lightning.ai\/docs\/pytorch\/latest\/data\/datamodule.html\">here<\/a>.<\/p><\/blockquote><\/div>\n<p>Let&#8217;s explore how to use the Lightning Trainer with a LightningModule and go through a few of the flags using the example below. 
We create a Lightning Trainer object configured for 4 GPUs and <a href=\"https:\/\/lightning.ai\/pages\/community\/tutorial\/accelerating-large-language-models-with-mixed-precision-techniques\/\">mixed-precision training<\/a> with the float16 data type, and instantiate the <strong><code>MyLitModel<\/code><\/strong> model that we defined in the previous section. Finally, we start the training by passing the model and the training dataloader to the <strong><code>trainer.fit<\/code><\/strong> method.<\/p>\n<pre class=\"hljs collapse-false language-python\"><code class=\"language-python\">trainer = pl.Trainer(\r\n    devices=4,\r\n    accelerator=\"gpu\",\r\n    precision=\"16-mixed\",  # float16 mixed precision\r\n)\r\n\r\nmodel = MyLitModel()\r\n# train_dataloader is any torch.utils.data.DataLoader over your training set\r\ntrainer.fit(model, train_dataloaders=train_dataloader)\r\n<\/code><\/pre>\n<h3><strong>Loggers and Callbacks<\/strong><\/h3>\n<p>You can also add a logger, such as TensorBoard, WandB, Comet, or a simple CSVLogger, to monitor the loss or any other metrics that you&#8217;ve logged during training. For simplicity, we will use TensorBoard in this blog. 
You can just import the <code>TensorBoardLogger<\/code> and add it to the Trainer as shown below:<\/p>\n<pre class=\"hljs collapse-false language-python\"><code class=\"language-python\">from pytorch_lightning.loggers import TensorBoardLogger\r\n\r\ntrainer = pl.Trainer(logger=TensorBoardLogger(save_dir=\"logs\/\"))\r\n\r\ntrainer.fit(model, train_dataloader, val_dataloader)\r\n<\/code><\/pre>\n<p>To start the Tensorboard web UI, run the command <code>tensorboard --logdir logs\/<\/code> from your terminal, and it will launch the Tensorboard UI on the default port 6006.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-5649108\" src=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/10\/pl-walkthrough-tb1.png\" alt=\"\" width=\"2000\" height=\"1114\" srcset=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/10\/pl-walkthrough-tb1.png 2000w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/10\/pl-walkthrough-tb1-300x167.png 300w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/10\/pl-walkthrough-tb1-1024x570.png 1024w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/10\/pl-walkthrough-tb1-1536x856.png 1536w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/10\/pl-walkthrough-tb1-300x167@2x.png 600w\" sizes=\"(max-width: 2000px) 100vw, 2000px\" \/><\/p>\n<p>PyTorch Lightning provides several built-in <a href=\"https:\/\/lightning.ai\/docs\/pytorch\/stable\/extensions\/callbacks.html#built-in-callbacks\">callbacks<\/a>, such as <code>BatchSizeFinder<\/code>, <code>EarlyStopping<\/code>, <code>ModelCheckpoint<\/code>, and more. These callbacks offer valuable additional functionality to manage and manipulate training at various stages of the loop. 
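<\/p>\n<p>To make the idea of a callback concrete, here is a minimal sketch of the patience rule behind early stopping, written in plain Python. It is an illustration under simplifying assumptions (lower values are better, no minimum-improvement threshold), not Lightning&#8217;s implementation; <code>stopping_check<\/code> and the loss values are hypothetical.<\/p>\n<pre class=\"hljs collapse-false language-python\"><code class=\"language-python\">def stopping_check(val_losses, patience=7):\r\n    # Hypothetical helper: returns the index of the validation check\r\n    # at which training would stop, or None if patience is never exhausted.\r\n    best = float('inf')\r\n    waited = 0\r\n    for check, loss in enumerate(val_losses):\r\n        if loss &lt; best:\r\n            best, waited = loss, 0   # improvement resets the counter\r\n        else:\r\n            waited += 1              # one more check without improvement\r\n        if waited == patience:\r\n            return check\r\n    return None\r\n\r\nprint(stopping_check([0.9, 0.7, 0.8, 0.75, 0.72], patience=3))  # prints 4\r\n<\/code><\/pre>\n<p>With a patience of 3, the sketch stops at the fifth check (index 4), after three consecutive checks that fail to beat the best loss of 0.7.<\/p>\n<p>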
In this blog, we will use the <a href=\"https:\/\/lightning.ai\/docs\/pytorch\/stable\/api\/lightning.pytorch.callbacks.EarlyStopping.html#lightning.pytorch.callbacks.EarlyStopping\">EarlyStopping<\/a> callback to automatically stop our training once the monitored metric (e.g., validation loss) stops improving. You can also configure other arguments, such as <code>patience<\/code>, to determine the number of checks before training should stop.<\/p>\n<pre class=\"hljs collapse-false language-python\"><code class=\"language-python\">from pytorch_lightning.callbacks import EarlyStopping\r\n\r\nearly_stopping = EarlyStopping('val_loss', patience=7)\r\ntrainer = pl.Trainer(callbacks=early_stopping)\r\ntrainer.fit(model, train_dataloader, val_dataloader)\r\n<\/code><\/pre>\n<h2>Training an Image Classifier (Convolutional Neural Networks or CNN) on the CIFAR-10 dataset using PyTorch Lightning<\/h2>\n<p>Now that we have learned about LightningModule and Trainer, we will proceed to train an image classification model using the CIFAR-10 dataset. We will begin by loading the dataset from torchvision, defining the model, training_step, and validation_step, and then create the Trainer by configuring devices and loggers to monitor our training.<\/p>\n<h3>Create Dataset<\/h3>\n<p>We use torchvision library for loading the training and validation data. 
We will use the <code>torchvision.transforms<\/code> module to convert our images to PyTorch tensors and normalize the pixel values.<\/p>\n<pre class=\"hljs collapse-false language-python\"><code class=\"language-python\">import torch\r\nimport torchvision\r\nimport torchvision.transforms as transforms\r\nimport os\r\n\r\nbatch_size = 8\r\nNUM_WORKERS = int(os.cpu_count() \/ 2)\r\n\r\ntransform = transforms.Compose(\r\n    [transforms.ToTensor(),\r\n     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])\r\n\r\ntrainset = torchvision.datasets.CIFAR10(root='.\/data', train=True,\r\n                                        download=True, transform=transform)\r\ntrain_dataloader = torch.utils.data.DataLoader(trainset, batch_size=batch_size,\r\n                                          shuffle=True, num_workers=NUM_WORKERS)\r\n\r\nvalset = torchvision.datasets.CIFAR10(root='.\/data', train=False,\r\n                                       download=True, transform=transform)\r\nval_dataloader = torch.utils.data.DataLoader(valset, batch_size=batch_size,\r\n                                         shuffle=False, num_workers=NUM_WORKERS)\r\n<\/code><\/pre>\n<p>It&#8217;s a good idea to visualize your dataset before training. 
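<\/p>\n<p>As a quick sanity check on the transform above, the sketch below reproduces the arithmetic of <code>Normalize<\/code> with a mean and standard deviation of 0.5 in plain Python: it maps pixel values from [0, 1] to [-1, 1], and dividing by 2 and adding 0.5 maps them back for plotting. The helper names <code>normalize<\/code> and <code>unnormalize<\/code> are hypothetical and exist only for this illustration.<\/p>\n<pre class=\"hljs collapse-false language-python\"><code class=\"language-python\">def normalize(pixel, mean=0.5, std=0.5):\r\n    # Same per-channel formula as transforms.Normalize: (x - mean) \/ std\r\n    return (pixel - mean) \/ std\r\n\r\ndef unnormalize(value):\r\n    # Inverse mapping for mean=0.5, std=0.5: value * 0.5 + 0.5\r\n    return value \/ 2 + 0.5\r\n\r\nfor pixel in (0.0, 0.25, 1.0):\r\n    scaled = normalize(pixel)\r\n    print(pixel, scaled, unnormalize(scaled))\r\n<\/code><\/pre>\n<p>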
Our dataset contains small images with a resolution of only 32&#215;32.<\/p>\n<pre class=\"hljs collapse-false language-python\"><code class=\"language-python\">import matplotlib.pyplot as plt\r\nimport numpy as np\r\n\r\n# CIFAR-10 class names, in label order\r\nclasses = ('plane', 'car', 'bird', 'cat', 'deer',\r\n           'dog', 'frog', 'horse', 'ship', 'truck')\r\n\r\n# functions to show an image\r\n\r\ndef imshow(img):\r\n    img = img \/ 2 + 0.5     # unnormalize\r\n    npimg = img.numpy()\r\n    plt.imshow(np.transpose(npimg, (1, 2, 0)))\r\n    plt.show()\r\n\r\n# get some random training images\r\ndataiter = iter(train_dataloader)\r\nimages, labels = next(dataiter)\r\n\r\n# show images\r\nimshow(torchvision.utils.make_grid(images))\r\n# print labels\r\nprint(' '.join(f'{classes[labels[j]]:5s}' for j in range(batch_size)))\r\n<\/code><\/pre>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-5649109\" src=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/10\/pl-walk-vis.png\" alt=\"\" width=\"2000\" height=\"502\" srcset=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/10\/pl-walk-vis.png 2000w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/10\/pl-walk-vis-300x75.png 300w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/10\/pl-walk-vis-1024x257.png 1024w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/10\/pl-walk-vis-1536x386.png 1536w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/10\/pl-walk-vis-300x75@2x.png 600w\" sizes=\"(max-width: 2000px) 100vw, 2000px\" \/><\/p>\n<h3>Create Model<\/h3>\n<p>We create a small Convolutional Neural Network (CNN) model with two convolutional layers, one max-pooling layer that is applied after each convolution, and three fully-connected layers. 
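<\/p>\n<p>The first fully-connected layer of this network expects 16 * 5 * 5 = 400 input features. A rough sketch of where that number comes from, assuming 32&#215;32 inputs, 5&#215;5 convolutions with stride 1 and no padding, and 2&#215;2 max pooling with stride 2 (<code>conv_out<\/code> is a hypothetical helper, not a PyTorch function):<\/p>\n<pre class=\"hljs collapse-false language-python\"><code class=\"language-python\">def conv_out(size, kernel, stride=1, padding=0):\r\n    # Standard output-size formula for a convolution or pooling window\r\n    return (size + 2 * padding - kernel) \/\/ stride + 1\r\n\r\nsize = 32                                  # CIFAR-10 images are 32x32\r\nsize = conv_out(size, kernel=5)            # conv1: 32 to 28\r\nsize = conv_out(size, kernel=2, stride=2)  # pool:  28 to 14\r\nsize = conv_out(size, kernel=5)            # conv2: 14 to 10\r\nsize = conv_out(size, kernel=2, stride=2)  # pool:  10 to 5\r\nprint(16 * size * size)                    # 16 channels * 5 * 5 = 400\r\n<\/code><\/pre>\n<p>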
You can learn more about CNNs in the <a href=\"https:\/\/lightning.ai\/courses\/deep-learning-fundamentals\/unit-7-overview-getting-started-with-computer-vision\/unit-7.3-convolutional-neural-network-architectures\/\">Deep Learning Fundamentals<\/a> course.<\/p>\n<pre class=\"hljs collapse-false language-python\"><code class=\"language-python\">import torch\r\nimport torch.nn as nn\r\nimport torch.nn.functional as F\r\n\r\nclass Net(nn.Module):\r\n    def __init__(self):\r\n        super().__init__()\r\n        self.conv1 = nn.Conv2d(3, 6, 5)\r\n        self.pool = nn.MaxPool2d(2, 2)\r\n        self.conv2 = nn.Conv2d(6, 16, 5)\r\n        self.fc1 = nn.Linear(16 * 5 * 5, 120)\r\n        self.fc2 = nn.Linear(120, 84)\r\n        self.fc3 = nn.Linear(84, 10)\r\n\r\n    def forward(self, x):\r\n        x = self.pool(F.relu(self.conv1(x)))\r\n        x = self.pool(F.relu(self.conv2(x)))\r\n        x = torch.flatten(x, 1) # flatten all dimensions except batch\r\n        x = F.relu(self.fc1(x))\r\n        x = F.relu(self.fc2(x))\r\n        x = self.fc3(x)\r\n        return x\r\n<\/code><\/pre>\n<p>Next, we define the LightningModule to structure our training logic. In the <strong><code>__init__<\/code><\/strong> method, we initialize our model, accuracy metric, and loss function. We use the <strong><code>self.save_hyperparameters()<\/code><\/strong> method to store the learning rate argument. The training and validation steps calculate loss and accuracy and log the results using the <strong><code>self.log<\/code><\/strong> method. 
Finally, we configure the optimizer by defining SGD in the <strong><code>configure_optimizers<\/code><\/strong> method.<\/p>\n<pre class=\"hljs collapse-false language-python\"><code class=\"language-python\">import pytorch_lightning as pl\r\nfrom torchmetrics import Accuracy\r\n\r\nNUM_CLASSES = 10  # CIFAR-10 has 10 classes\r\n\r\nclass MyLitModel(pl.LightningModule):\r\n    def __init__(self, lr=0.05):\r\n        super().__init__()\r\n        self.save_hyperparameters()\r\n        self.model = Net()\r\n        self.train_accuracy = Accuracy(task=\"multiclass\", num_classes=NUM_CLASSES)\r\n        self.val_accuracy = Accuracy(task=\"multiclass\", num_classes=NUM_CLASSES)\r\n        self.loss_fn = nn.CrossEntropyLoss()\r\n    \r\n    def forward(self, x):\r\n        return self.model(x)\r\n\r\n    def training_step(self, batch, batch_idx):\r\n        x, y = batch\r\n        logits = self(x)\r\n        loss = self.loss_fn(logits, y)\r\n        self.log(\"train_loss\", loss)\r\n        acc = self.train_accuracy(logits, y)\r\n        self.log(\"train_accuracy\", acc)\r\n        return loss\r\n\r\n    def validation_step(self, batch, batch_idx):\r\n        x, y = batch\r\n        logits = self(x)\r\n        loss = self.loss_fn(logits, y)\r\n        self.log(\"val_loss\", loss)\r\n        acc = self.val_accuracy(logits, y)\r\n        self.log(\"val_accuracy\", acc)\r\n\r\n    def configure_optimizers(self):\r\n        optimizer = torch.optim.SGD(self.parameters(), lr=self.hparams.lr, momentum=0.9)\r\n        return optimizer\r\n<\/code><\/pre>\n<h3>Train Model<\/h3>\n<p>We will create an object of <code>MyLitModel<\/code> with a learning rate of 0.001. You can also use the LearningRateFinder callback to discover an appropriate learning rate for your model and data. The model will be trained for up to 30 epochs (early stopping may end training sooner), with the Trainer automatically selecting the appropriate device (CPU, GPU, or TPU) and the number of devices. 
When <code>devices<\/code> is set to <code>auto<\/code>, the Trainer uses all available devices of the selected accelerator, for example every available GPU.<\/p>\n<pre class=\"hljs collapse-false language-python\"><code class=\"language-python\">from pytorch_lightning.loggers import TensorBoardLogger\r\nfrom pytorch_lightning.callbacks import EarlyStopping\r\n\r\nmodel = MyLitModel(lr=0.001)\r\n\r\ntrainer = pl.Trainer(\r\n    max_epochs=30,\r\n    accelerator=\"auto\",\r\n    devices=\"auto\",\r\n    logger=TensorBoardLogger(save_dir=\"logs\/\"),\r\n    callbacks=EarlyStopping('val_loss', patience=7),\r\n)\r\n\r\ntrainer.fit(model, train_dataloader, val_dataloader)\r\n<\/code><\/pre>\n<p>You can compare the training and validation accuracy curves in the TensorBoard UI to spot overfitting.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-5649110\" src=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/10\/pl-walk-plot.png\" alt=\"\" width=\"2000\" height=\"850\" srcset=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/10\/pl-walk-plot.png 2000w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/10\/pl-walk-plot-300x128.png 300w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/10\/pl-walk-plot-1024x435.png 1024w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/10\/pl-walk-plot-1536x653.png 1536w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/10\/pl-walk-plot-300x128@2x.png 600w\" sizes=\"(max-width: 2000px) 100vw, 2000px\" \/><\/p>\n<p>Once the training is finished, you can export the model and deploy it in production or submit it to an evaluation suite like Kaggle. You can adapt this code to train a much larger model on a large dataset as well. 
The code is available <a href=\"https:\/\/github.com\/aniketmaurya\/deep-learning-examples\/blob\/main\/src\/pytorch_lightning\/image-classification\/cifar10.ipynb\">here<\/a>; feel free to give it a try with your favorite logger, callback, or Trainer flag.<\/p>\n<h2>Conclusion<\/h2>\n<p>PyTorch Lightning streamlines the entire training and evaluation process by eliminating boilerplate code used for organizing the training code and scaling large models. It offers 35 training flags to facilitate improved debugging, monitoring, and scaling.<\/p>\n<p>For complete beginners, I highly recommend enrolling in the <a href=\"https:\/\/lightning.ai\/courses\/deep-learning-fundamentals\/\">Deep Learning Fundamentals<\/a> course. This comprehensive course covers a wide range of topics, starting from the basics of machine learning and deep learning, to utilizing PyTorch for tasks like computer vision, natural language processing, and handling large language models (LLMs).<\/p>\n<p>Additionally, you can delve deeper into PyTorch Lightning by exploring our <a href=\"https:\/\/lightning.ai\/docs\/pytorch\/stable\/\">documentation<\/a>, which is divided into different levels of expertise.<\/p>\n<p>Join our <a href=\"https:\/\/discord.gg\/STRb9gSVSE\">Discord community<\/a> to ask questions and discuss ideas with the Lightning AI community. \ud83d\udc9c<\/p>\n<p><!-- notionvc: 867561b0-928a-49eb-a2ba-c9f3c19dee5d --><\/p>\n","protected":false},"excerpt":{"rendered":"<p>In this blog, you will learn about the different components of PyTorch Lightning and how to train an image classifier on the CIFAR-10 dataset with PyTorch Lightning. We will also discuss how to use loggers and callbacks like Tensorboard, ModelCheckpoint, etc. 
PyTorch Lightning is a high-level wrapper over PyTorch which makes model training easier and<a class=\"excerpt-read-more\" href=\"https:\/\/lightning.ai\/pages\/community\/tutorial\/step-by-step-walk-through-of-pytorch-lightning\/\" title=\"ReadStep-By-Step Walk-Through of Pytorch Lightning\">&#8230; Read more &raquo;<\/a><\/p>\n","protected":false},"author":16,"featured_media":5649112,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"inline_featured_image":false,"footnotes":"","_links_to":"","_links_to_target":""},"categories":[29,106,41],"tags":[],"glossary":[],"acf":{"additional_authors":false,"mathjax":false,"default_editor":true,"show_table_of_contents":true,"hide_from_archive":false,"content_type":"Blog Post","sticky":false,"code_embed":false,"tabs":false,"custom_styles":"","table_of_contents":""},"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v24.5 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Step-By-Step Walk-Through of Pytorch Lightning - Lightning AI<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/lightning.ai\/pages\/community\/tutorial\/step-by-step-walk-through-of-pytorch-lightning\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Step-By-Step Walk-Through of Pytorch Lightning - Lightning AI\" \/>\n<meta property=\"og:description\" content=\"In this blog, you will learn about the different components of PyTorch Lightning and how to train an image classifier on the CIFAR-10 dataset with PyTorch Lightning. We will also discuss how to use loggers and callbacks like Tensorboard, ModelCheckpoint, etc. PyTorch Lightning is a high-level wrapper over PyTorch which makes model training easier and... 
Read more &raquo;\" \/>\n<meta property=\"og:url\" content=\"https:\/\/lightning.ai\/pages\/community\/tutorial\/step-by-step-walk-through-of-pytorch-lightning\/\" \/>\n<meta property=\"og:site_name\" content=\"Lightning AI\" \/>\n<meta property=\"article:published_time\" content=\"2023-10-19T09:09:34+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2023-10-23T11:55:28+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/10\/Snap-4.png\" \/>\n\t<meta property=\"og:image:width\" content=\"2926\" \/>\n\t<meta property=\"og:image:height\" content=\"1984\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"JP Hennessy\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@LightningAI\" \/>\n<meta name=\"twitter:site\" content=\"@LightningAI\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"JP Hennessy\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"9 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/lightning.ai\/pages\/community\/tutorial\/step-by-step-walk-through-of-pytorch-lightning\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/lightning.ai\/pages\/community\/tutorial\/step-by-step-walk-through-of-pytorch-lightning\/\"},\"author\":{\"name\":\"JP Hennessy\",\"@id\":\"https:\/\/lightning.ai\/pages\/#\/schema\/person\/2518f4d5541f8e98016f6289169141a6\"},\"headline\":\"Step-By-Step Walk-Through of Pytorch Lightning\",\"datePublished\":\"2023-10-19T09:09:34+00:00\",\"dateModified\":\"2023-10-23T11:55:28+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/lightning.ai\/pages\/community\/tutorial\/step-by-step-walk-through-of-pytorch-lightning\/\"},\"wordCount\":1224,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/lightning.ai\/pages\/#organization\"},\"image\":{\"@id\":\"https:\/\/lightning.ai\/pages\/community\/tutorial\/step-by-step-walk-through-of-pytorch-lightning\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/10\/Snap-4.png\",\"articleSection\":[\"Blog\",\"Community\",\"Tutorials\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/lightning.ai\/pages\/community\/tutorial\/step-by-step-walk-through-of-pytorch-lightning\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/lightning.ai\/pages\/community\/tutorial\/step-by-step-walk-through-of-pytorch-lightning\/\",\"url\":\"https:\/\/lightning.ai\/pages\/community\/tutorial\/step-by-step-walk-through-of-pytorch-lightning\/\",\"name\":\"Step-By-Step Walk-Through of Pytorch Lightning - Lightning 
AI\",\"isPartOf\":{\"@id\":\"https:\/\/lightning.ai\/pages\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/lightning.ai\/pages\/community\/tutorial\/step-by-step-walk-through-of-pytorch-lightning\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/lightning.ai\/pages\/community\/tutorial\/step-by-step-walk-through-of-pytorch-lightning\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/10\/Snap-4.png\",\"datePublished\":\"2023-10-19T09:09:34+00:00\",\"dateModified\":\"2023-10-23T11:55:28+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/lightning.ai\/pages\/community\/tutorial\/step-by-step-walk-through-of-pytorch-lightning\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/lightning.ai\/pages\/community\/tutorial\/step-by-step-walk-through-of-pytorch-lightning\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/lightning.ai\/pages\/community\/tutorial\/step-by-step-walk-through-of-pytorch-lightning\/#primaryimage\",\"url\":\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/10\/Snap-4.png\",\"contentUrl\":\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/10\/Snap-4.png\",\"width\":2926,\"height\":1984},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/lightning.ai\/pages\/community\/tutorial\/step-by-step-walk-through-of-pytorch-lightning\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/lightning.ai\/pages\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Step-By-Step Walk-Through of Pytorch Lightning\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/lightning.ai\/pages\/#website\",\"url\":\"https:\/\/lightning.ai\/pages\/\",\"name\":\"Lightning AI\",\"description\":\"The platform for teams to build 
AI.\",\"publisher\":{\"@id\":\"https:\/\/lightning.ai\/pages\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/lightning.ai\/pages\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/lightning.ai\/pages\/#organization\",\"name\":\"Lightning AI\",\"url\":\"https:\/\/lightning.ai\/pages\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/lightning.ai\/pages\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/02\/image-17.png\",\"contentUrl\":\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/02\/image-17.png\",\"width\":1744,\"height\":856,\"caption\":\"Lightning AI\"},\"image\":{\"@id\":\"https:\/\/lightning.ai\/pages\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/x.com\/LightningAI\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/lightning.ai\/pages\/#\/schema\/person\/2518f4d5541f8e98016f6289169141a6\",\"name\":\"JP Hennessy\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/lightning.ai\/pages\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/28ade268218ae45f723b0b62499f527a?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/28ade268218ae45f723b0b62499f527a?s=96&d=mm&r=g\",\"caption\":\"JP Hennessy\"},\"url\":\"https:\/\/lightning.ai\/pages\/author\/jplightning-ai\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Step-By-Step Walk-Through of Pytorch Lightning - Lightning AI","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/lightning.ai\/pages\/community\/tutorial\/step-by-step-walk-through-of-pytorch-lightning\/","og_locale":"en_US","og_type":"article","og_title":"Step-By-Step Walk-Through of Pytorch Lightning - Lightning AI","og_description":"In this blog, you will learn about the different components of PyTorch Lightning and how to train an image classifier on the CIFAR-10 dataset with PyTorch Lightning. We will also discuss how to use loggers and callbacks like Tensorboard, ModelCheckpoint, etc. PyTorch Lightning is a high-level wrapper over PyTorch which makes model training easier and... Read more &raquo;","og_url":"https:\/\/lightning.ai\/pages\/community\/tutorial\/step-by-step-walk-through-of-pytorch-lightning\/","og_site_name":"Lightning AI","article_published_time":"2023-10-19T09:09:34+00:00","article_modified_time":"2023-10-23T11:55:28+00:00","og_image":[{"width":2926,"height":1984,"url":"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/10\/Snap-4.png","type":"image\/png"}],"author":"JP Hennessy","twitter_card":"summary_large_image","twitter_creator":"@LightningAI","twitter_site":"@LightningAI","twitter_misc":{"Written by":"JP Hennessy","Est. 
reading time":"9 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/lightning.ai\/pages\/community\/tutorial\/step-by-step-walk-through-of-pytorch-lightning\/#article","isPartOf":{"@id":"https:\/\/lightning.ai\/pages\/community\/tutorial\/step-by-step-walk-through-of-pytorch-lightning\/"},"author":{"name":"JP Hennessy","@id":"https:\/\/lightning.ai\/pages\/#\/schema\/person\/2518f4d5541f8e98016f6289169141a6"},"headline":"Step-By-Step Walk-Through of Pytorch Lightning","datePublished":"2023-10-19T09:09:34+00:00","dateModified":"2023-10-23T11:55:28+00:00","mainEntityOfPage":{"@id":"https:\/\/lightning.ai\/pages\/community\/tutorial\/step-by-step-walk-through-of-pytorch-lightning\/"},"wordCount":1224,"commentCount":0,"publisher":{"@id":"https:\/\/lightning.ai\/pages\/#organization"},"image":{"@id":"https:\/\/lightning.ai\/pages\/community\/tutorial\/step-by-step-walk-through-of-pytorch-lightning\/#primaryimage"},"thumbnailUrl":"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/10\/Snap-4.png","articleSection":["Blog","Community","Tutorials"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/lightning.ai\/pages\/community\/tutorial\/step-by-step-walk-through-of-pytorch-lightning\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/lightning.ai\/pages\/community\/tutorial\/step-by-step-walk-through-of-pytorch-lightning\/","url":"https:\/\/lightning.ai\/pages\/community\/tutorial\/step-by-step-walk-through-of-pytorch-lightning\/","name":"Step-By-Step Walk-Through of Pytorch Lightning - Lightning 
AI","isPartOf":{"@id":"https:\/\/lightning.ai\/pages\/#website"},"primaryImageOfPage":{"@id":"https:\/\/lightning.ai\/pages\/community\/tutorial\/step-by-step-walk-through-of-pytorch-lightning\/#primaryimage"},"image":{"@id":"https:\/\/lightning.ai\/pages\/community\/tutorial\/step-by-step-walk-through-of-pytorch-lightning\/#primaryimage"},"thumbnailUrl":"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/10\/Snap-4.png","datePublished":"2023-10-19T09:09:34+00:00","dateModified":"2023-10-23T11:55:28+00:00","breadcrumb":{"@id":"https:\/\/lightning.ai\/pages\/community\/tutorial\/step-by-step-walk-through-of-pytorch-lightning\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/lightning.ai\/pages\/community\/tutorial\/step-by-step-walk-through-of-pytorch-lightning\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/lightning.ai\/pages\/community\/tutorial\/step-by-step-walk-through-of-pytorch-lightning\/#primaryimage","url":"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/10\/Snap-4.png","contentUrl":"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/10\/Snap-4.png","width":2926,"height":1984},{"@type":"BreadcrumbList","@id":"https:\/\/lightning.ai\/pages\/community\/tutorial\/step-by-step-walk-through-of-pytorch-lightning\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/lightning.ai\/pages\/"},{"@type":"ListItem","position":2,"name":"Step-By-Step Walk-Through of Pytorch Lightning"}]},{"@type":"WebSite","@id":"https:\/\/lightning.ai\/pages\/#website","url":"https:\/\/lightning.ai\/pages\/","name":"Lightning AI","description":"The platform for teams to build 
AI.","publisher":{"@id":"https:\/\/lightning.ai\/pages\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/lightning.ai\/pages\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/lightning.ai\/pages\/#organization","name":"Lightning AI","url":"https:\/\/lightning.ai\/pages\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/lightning.ai\/pages\/#\/schema\/logo\/image\/","url":"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/02\/image-17.png","contentUrl":"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/02\/image-17.png","width":1744,"height":856,"caption":"Lightning AI"},"image":{"@id":"https:\/\/lightning.ai\/pages\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/x.com\/LightningAI"]},{"@type":"Person","@id":"https:\/\/lightning.ai\/pages\/#\/schema\/person\/2518f4d5541f8e98016f6289169141a6","name":"JP Hennessy","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/lightning.ai\/pages\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/28ade268218ae45f723b0b62499f527a?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/28ade268218ae45f723b0b62499f527a?s=96&d=mm&r=g","caption":"JP 
Hennessy"},"url":"https:\/\/lightning.ai\/pages\/author\/jplightning-ai\/"}]}},"_links":{"self":[{"href":"https:\/\/lightning.ai\/pages\/wp-json\/wp\/v2\/posts\/5649104"}],"collection":[{"href":"https:\/\/lightning.ai\/pages\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/lightning.ai\/pages\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/lightning.ai\/pages\/wp-json\/wp\/v2\/users\/16"}],"replies":[{"embeddable":true,"href":"https:\/\/lightning.ai\/pages\/wp-json\/wp\/v2\/comments?post=5649104"}],"version-history":[{"count":0,"href":"https:\/\/lightning.ai\/pages\/wp-json\/wp\/v2\/posts\/5649104\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/lightning.ai\/pages\/wp-json\/wp\/v2\/media\/5649112"}],"wp:attachment":[{"href":"https:\/\/lightning.ai\/pages\/wp-json\/wp\/v2\/media?parent=5649104"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/lightning.ai\/pages\/wp-json\/wp\/v2\/categories?post=5649104"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/lightning.ai\/pages\/wp-json\/wp\/v2\/tags?post=5649104"},{"taxonomy":"glossary","embeddable":true,"href":"https:\/\/lightning.ai\/pages\/wp-json\/wp\/v2\/glossary?post=5649104"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}