Start from an ML system template

Required background: Basic Python familiarity and a completed install guide.

Goal: We’ll walk you through the 4 key steps to run a Lightning App that trains and demos a model.


The Train & Demo PyTorch Lightning Application

Find the Train & Demo PyTorch Lightning application in the Lightning.ai App Gallery.

Here is a recording of this App running locally and in the cloud with the same behavior.




In the steps below, we are going to show you how to build this application.

Here is the App’s entire code and its commented components.


Step 1: Install Lightning

If you are using a virtual env, don’t forget to activate it before running commands. You must do so in every new shell.

Tip

We highly recommend using virtual environments.

pip install lightning
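
If you are unsure whether a virtual environment is active in the current shell, comparing Python's prefix paths is a quick check. This is a minimal sketch using only the standard library; the helper name `in_virtualenv` is ours, and the check applies to environments created with `venv` (conda environments are not detected this way).

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a virtual environment, sys.prefix points at the environment,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    if in_virtualenv():
        print(f"Virtual environment active: {sys.prefix}")
    else:
        print("No virtual environment detected; consider activating one first.")
```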

Step 2: Install the Train and Demo App

The first Lightning App we’ll explore is an App to train and demo a machine learning model.

Install this App by typing:

lightning install app lightning/quick-start

Verify the App was successfully installed:

cd lightning-quick-start
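
Changing into the directory only confirms that it exists. As a rough extra check, the sketch below looks for the two files that the App code on this page relies on, `app.py` and `train_script.py`. The helper name `missing_files` is hypothetical, and we assume both files sit at the top level of `lightning-quick-start`, as the `script_path` expression in `app.py` suggests.

```python
from pathlib import Path
from typing import List

# Files referenced by the App code on this page (assumed to be at the top level).
EXPECTED_FILES = ("app.py", "train_script.py")


def missing_files(app_dir: str) -> List[str]:
    """Return the expected files that are absent from app_dir."""
    root = Path(app_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_files("lightning-quick-start")
    print("Install looks complete" if not missing else f"Missing: {missing}")
```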

Step 3: Run the App locally

Run the App locally with the run command: 🤯

lightning run app app.py

Step 4: Run the App in the cloud

Add the --cloud argument to run on the Lightning.ai cloud. 🤯🤯🤯

lightning run app app.py --cloud
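
If you prefer to launch the App from a script, note that the local and cloud invocations differ only by the `--cloud` flag. The following is a minimal sketch, assuming the `lightning` CLI installed in Step 1 is on your PATH; the helper names `build_run_command` and `launch` are hypothetical.

```python
import subprocess
from typing import List


def build_run_command(app_file: str = "app.py", cloud: bool = False) -> List[str]:
    """Assemble the `lightning run app` command shown in Steps 3 and 4."""
    command = ["lightning", "run", "app", app_file]
    if cloud:
        # The only difference between a local run and a cloud run.
        command.append("--cloud")
    return command


def launch(cloud: bool = False) -> int:
    """Run the App through the CLI and return its exit code."""
    return subprocess.run(build_run_command(cloud=cloud)).returncode
```

Calling `launch()` corresponds to Step 3, and `launch(cloud=True)` to Step 4.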

Understand the code

The App that we just launched trained a PyTorch Lightning model (although any framework works), then added an interactive demo.

This is the App’s code:

# lightning-quick-start/app.py
import os.path as ops
import lightning as L
from quick_start.components import PyTorchLightningScript, ImageServeGradio

class TrainDeploy(L.LightningFlow):
    def __init__(self):
        super().__init__()
        self.train_work = PyTorchLightningScript(
            script_path=ops.join(ops.dirname(__file__), "./train_script.py"),
            script_args=["--trainer.max_epochs=5"],
        )

        self.serve_work = ImageServeGradio(L.CloudCompute("cpu-small"))

    def run(self):
        # 1. Run the python script that trains the model
        self.train_work.run()

        # 2. when a checkpoint is available, deploy
        if self.train_work.best_model_path:
            self.serve_work.run(self.train_work.best_model_path)

    def configure_layout(self):
        tab_1 = {"name": "Model training", "content": self.train_work}
        tab_2 = {"name": "Interactive demo", "content": self.serve_work}
        return [tab_1, tab_2]

app = L.LightningApp(TrainDeploy())

Let’s break down the code section by section to understand what it is doing.


1: Define root component

A Lightning App provides a cohesive product experience for a set of unrelated components.

The top-level component (Root) must subclass L.LightningFlow.

# lightning-quick-start/app.py (excerpt)
import lightning as L

# The root component subclasses L.LightningFlow.
class TrainDeploy(L.LightningFlow):
    ...  # body unchanged from the full listing above

2: Define components

In the __init__ method, we define the components that make up the App. In this case, there are 2 components: one that executes any PyTorch Lightning script (model training) and one that starts a Gradio server for demo purposes.

# lightning-quick-start/app.py (excerpt)
class TrainDeploy(L.LightningFlow):
    def __init__(self):
        super().__init__()
        # Component 1: runs the PyTorch Lightning training script
        self.train_work = PyTorchLightningScript(
            script_path=ops.join(ops.dirname(__file__), "./train_script.py"),
            script_args=["--trainer.max_epochs=5"],
        )

        # Component 2: serves the trained model through a Gradio demo
        self.serve_work = ImageServeGradio(L.CloudCompute("cpu-small"))
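
The `script_path` argument is built relative to `app.py` rather than written as a bare filename, so the training script is looked up next to the App file. The sketch below reproduces that expression with a made-up location for `app.py` (the `/home/user/...` path is purely illustrative) and uses `posixpath`, which has the same API as `os.path`, so the result is identical on every platform.

```python
import posixpath as ops  # same API as os.path, with POSIX separators everywhere

# Hypothetical location of app.py, standing in for __file__.
app_file = "/home/user/lightning-quick-start/app.py"

script_path = ops.join(ops.dirname(app_file), "./train_script.py")
print(script_path)
# /home/user/lightning-quick-start/./train_script.py

# The "./" segment is harmless; normpath shows the path it resolves to.
print(ops.normpath(script_path))
# /home/user/lightning-quick-start/train_script.py
```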

3: Define how components Flow

Every component has a run method. The run method defines the 🌊 Flow 🌊 of how components interact together.

In this case, we train a model (until completion). When it’s done AND there exists a checkpoint, we launch a demo server:

# lightning-quick-start/app.py (excerpt)
class TrainDeploy(L.LightningFlow):
    def run(self):
        # 1. Run the python script that trains the model
        self.train_work.run()

        # 2. when a checkpoint is available, deploy
        if self.train_work.best_model_path:
            self.serve_work.run(self.train_work.best_model_path)
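
To see why the `if` guard matters, it can help to picture `run` being called over and over while the App is alive. The toy model below imitates that pattern in plain Python. It is not the Lightning API: `ToyTrainWork`, `ToyServeWork`, `ToyFlow`, and the checkpoint path are hypothetical stand-ins, and the idea that the flow's `run` method is invoked repeatedly is an assumption made for illustration.

```python
from typing import List, Optional


class ToyTrainWork:
    """Stand-in for the training component; finishes after a few calls."""

    def __init__(self, ticks_needed: int = 2) -> None:
        self.best_model_path: Optional[str] = None
        self._ticks_left = ticks_needed

    def run(self) -> None:
        if self.best_model_path is not None:
            return  # already finished; repeated calls do nothing
        self._ticks_left -= 1
        if self._ticks_left == 0:
            # Pretend training finished and wrote a checkpoint (made-up path).
            self.best_model_path = "checkpoints/best.ckpt"


class ToyServeWork:
    """Stand-in for the demo server; records the checkpoints it was started with."""

    def __init__(self) -> None:
        self.served: List[str] = []

    def run(self, model_path: str) -> None:
        if model_path not in self.served:
            self.served.append(model_path)


class ToyFlow:
    def __init__(self) -> None:
        self.train_work = ToyTrainWork()
        self.serve_work = ToyServeWork()

    def run(self) -> None:
        self.train_work.run()
        # Same guard as in TrainDeploy.run: serve only once a checkpoint exists.
        if self.train_work.best_model_path:
            self.serve_work.run(self.train_work.best_model_path)


flow = ToyFlow()
history = []
for _ in range(4):  # assumed: the framework calls run() repeatedly
    flow.run()
    history.append(len(flow.serve_work.served))

print(history)
# [0, 1, 1, 1]
```

The server stays idle on the first pass, starts once the checkpoint path appears, and is not started again afterwards.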

Note

If you’ve used other ML systems you’ll be pleasantly surprised to not find decorators or YAML files.


4: Connect web user interfaces

All our favorite tools normally have their own web user interfaces (UI).

Implement the configure_layout method to connect them together:

# lightning-quick-start/app.py (excerpt)
class TrainDeploy(L.LightningFlow):
    def configure_layout(self):
        # Each tab pairs a display name with the component whose UI it shows
        tab_1 = {"name": "Model training", "content": self.train_work}
        tab_2 = {"name": "Interactive demo", "content": self.serve_work}
        return [tab_1, tab_2]
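
The layout returned by `configure_layout` is a plain list of dictionaries, each with a `name` and a `content` key. The sketch below checks that shape for a layout like the one above. The helper `check_tabs` is hypothetical and the checks are our own sanity rules, not something taken from Lightning; plain strings stand in for the components.

```python
from typing import Any, Dict, List


def check_tabs(tabs: List[Dict[str, Any]]) -> List[str]:
    """Return the tab names in display order, or raise if a tab is malformed."""
    names = []
    for index, tab in enumerate(tabs):
        missing = {"name", "content"} - set(tab)
        if missing:
            raise ValueError(f"tab {index} is missing keys: {sorted(missing)}")
        names.append(tab["name"])
    return names


# Plain strings stand in for the components used as "content" in the App.
layout = [
    {"name": "Model training", "content": "train_work"},
    {"name": "Interactive demo", "content": "serve_work"},
]
print(check_tabs(layout))
# ['Model training', 'Interactive demo']
```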

5: Init the app object

Initialize an app object with the TrainDeploy component (this won’t run the App yet):

# lightning-quick-start/app.py (excerpt)
app = L.LightningApp(TrainDeploy())

What components are supported?

Any component can work with Lightning AI!

Next Steps