Share Files Between Components¶
Note
The content of this page is still in progress!
Audience: Users who want to share files between components.
Why do I need distributed storage?¶
In a Lightning App, some components can run on their own hardware. Distributed storage lets a file saved by a component on one machine be used transparently by components on other machines.
If you’ve asked the question “how do I use the checkpoint from this model to deploy this other thing”, you’ve needed distributed storage.
Write a file¶
To write a file, first create a reference to the file with the Path class, then write to it:
from lightning.app.storage import Path

# file reference
boring_file_reference = Path("boring_file.txt")

# write to that file
with open(boring_file_reference, "w") as f:
    f.write("yolo")
Use a file¶
To use a file, pass the reference to the file:
# open the file through its reference; the context manager closes it afterwards
with open(boring_file_reference, "r") as f:
    print(f.read())
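The write and read steps above can be combined into one round trip. The following is a minimal sketch, not the library's reference usage: it assumes the standard-library pathlib.Path as a stand-in for lightning.app.storage.Path so that it runs without Lightning installed, and the temporary directory (demo_dir) is purely illustrative.

```python
import tempfile
from pathlib import Path  # stand-in (assumption); in a Lightning App, import Path from lightning.app.storage

# Work in a throwaway directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as demo_dir:
    # file reference
    boring_file_reference = Path(demo_dir) / "boring_file.txt"

    # write to that file; the context manager closes the handle
    with open(boring_file_reference, "w") as f:
        f.write("yolo")

    # read it back through the same reference
    with open(boring_file_reference, "r") as f:
        content = f.read()

print(content)
```

On a single machine this is ordinary file I/O. The point of the Lightning Path reference is that the same two steps can live in different components.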
Example: Share a model checkpoint¶
A common workflow in ML is to use a checkpoint created by another component. First, define a component that saves a checkpoint:
import os

import torch

from lightning.app import LightningWork, LightningFlow, LightningApp
from lightning.app.storage.path import Path


class ModelTraining(LightningWork):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.checkpoints_path = Path("./checkpoints")

    def run(self):
        # make fake checkpoints
        checkpoint_1 = torch.tensor([0, 1, 2, 3, 4])
        checkpoint_2 = torch.tensor([0, 1, 2, 3, 4])
        os.makedirs(self.checkpoints_path, exist_ok=True)
        checkpoint_path = str(self.checkpoints_path / "checkpoint_{}.ckpt")
        torch.save(checkpoint_1, checkpoint_path.format("1"))
        torch.save(checkpoint_2, checkpoint_path.format("2"))

Next, define a component that needs the checkpoints:
class ModelDeploy(LightningWork):
    def __init__(self, ckpt_path, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.ckpt_path = ckpt_path

    def run(self):
        # sort the listing: os.listdir returns entries in arbitrary order
        ckpts = sorted(os.listdir(self.ckpt_path))
        checkpoint_1 = torch.load(os.path.join(self.ckpt_path, ckpts[0]))
        checkpoint_2 = torch.load(os.path.join(self.ckpt_path, ckpts[1]))
        print(f"Loaded checkpoint_1: {checkpoint_1}")
        print(f"Loaded checkpoint_2: {checkpoint_2}")

Link both components via a parent component:
class LitApp(LightningFlow):
    def __init__(self):
        super().__init__()
        self.train = ModelTraining()
        self.deploy = ModelDeploy(ckpt_path=self.train.checkpoints_path)

    def run(self):
        self.train.run()
        self.deploy.run()


app = LightningApp(LitApp())
Run the app above with the following command:
lightning run app docs/source/workflows/share_files_between_components/app.py
Your Lightning App is starting. This won't take long.
INFO: Your app has started. View it in your browser: http://127.0.0.1:7501/view
Loaded checkpoint_1: tensor([0, 1, 2, 3, 4])
Loaded checkpoint_2: tensor([0, 1, 2, 3, 4])
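The save, list, and load pattern used by the two components can be tried on its own. The following is a rough sketch under stated assumptions: pickle is swapped in for torch.save and torch.load so it runs without PyTorch, plain lists stand in for tensors, and save_checkpoints and load_checkpoints are hypothetical helper names rather than part of Lightning.

```python
import os
import pickle
import tempfile


def save_checkpoints(checkpoints_dir, checkpoints):
    """Write each fake checkpoint to checkpoint_<n>.ckpt inside checkpoints_dir."""
    os.makedirs(checkpoints_dir, exist_ok=True)
    for index, checkpoint in enumerate(checkpoints, start=1):
        with open(os.path.join(checkpoints_dir, f"checkpoint_{index}.ckpt"), "wb") as f:
            pickle.dump(checkpoint, f)  # stand-in for torch.save (assumption)


def load_checkpoints(checkpoints_dir):
    """Load every .ckpt file in name order, independent of os.listdir ordering."""
    loaded = []
    for name in sorted(os.listdir(checkpoints_dir)):
        if name.endswith(".ckpt"):
            with open(os.path.join(checkpoints_dir, name), "rb") as f:
                loaded.append(pickle.load(f))  # stand-in for torch.load (assumption)
    return loaded


with tempfile.TemporaryDirectory() as tmp_dir:
    checkpoints_dir = os.path.join(tmp_dir, "checkpoints")
    save_checkpoints(checkpoints_dir, [[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]])
    loaded = load_checkpoints(checkpoints_dir)

for index, checkpoint in enumerate(loaded, start=1):
    print(f"Loaded checkpoint_{index}: {checkpoint}")
```

Sorting by file name keeps the load order predictable. Plain string sorting would place checkpoint_10 before checkpoint_2, so a larger run would likely want zero-padded or numerically parsed indices.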
For example, here we save a file on one component and use it in another component:
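from lightning.app import LightningWork
from lightning.app.storage import Path

# placeholder content; the original snippet does not show how FILE_CONTENT is defined
FILE_CONTENT = "Hello there!"


class ComponentA(LightningWork):
    def __init__(self):
        super().__init__()
        self.boring_path = None

    def run(self):
        # This should be used as a REFERENCE to the file.
        self.boring_path = Path("boring_file.txt")
        with open(self.boring_path, "w") as f:
            f.write(FILE_CONTENT)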