Add Cloud Data

Audience: Users who want an app to read files stored in a cloud object bucket.

Mounting Public AWS S3 Buckets

Add Mount to a Work

To mount data from a cloud bucket to your app compute, initialize a Mount object with the source path of the S3 bucket and the absolute directory path where it should be mounted. Then pass the Mount to the CloudCompute of the LightningWork that needs the data.

This example mounts the S3 bucket s3://ryft-public-sample-data/esRedditJson/ at /content/esRedditJson/.

from lightning.app import CloudCompute
from lightning.app.storage import Mount

# Inside the __init__ of your LightningFlow; MyWorkClass is your own LightningWork subclass.
self.my_work = MyWorkClass(
    cloud_compute=CloudCompute(
        mounts=Mount(
            source="s3://ryft-public-sample-data/esRedditJson/",
            mount_path="/content/esRedditJson/",
        ),
    )
)

You can also attach multiple mounts to a single work by passing a list, [Mount(...), ...], to the CloudCompute(mounts=...) argument.

Note

  • Mounts support up to 1 million files and up to 5 GB per file. Need larger mounts? Contact support@lightning.ai

  • When adding multiple mounts, each one should have a unique mount_path.

  • A maximum of 10 Mounts can be added to a LightningWork.
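
The limits above can be checked before an app is launched. The following is a minimal sketch, not part of the Lightning API: MountSpec and validate_mounts are hypothetical names, and the s3:// check assumes that only S3 sources are in use, as in this guide. It mirrors the documented rules (absolute mount_path, unique mount_path, at most 10 mounts per work) using only the standard library.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath

MAX_MOUNTS_PER_WORK = 10  # limit quoted in the note above


@dataclass(frozen=True)
class MountSpec:
    """Hypothetical stand-in for Mount; it holds the same two fields."""

    source: str
    mount_path: str


def validate_mounts(specs):
    """Return specs unchanged, or raise ValueError if a documented rule is broken."""
    if len(specs) > MAX_MOUNTS_PER_WORK:
        raise ValueError(
            f"at most {MAX_MOUNTS_PER_WORK} mounts per work, got {len(specs)}"
        )
    seen = set()
    for spec in specs:
        # Assumption: only S3 sources are used, as in this guide.
        if not spec.source.startswith("s3://"):
            raise ValueError(f"source must be an S3 URI: {spec.source!r}")
        path = PurePosixPath(spec.mount_path)
        if not path.is_absolute():
            raise ValueError(f"mount_path must be absolute: {spec.mount_path!r}")
        # str(path) drops a trailing slash, so "/data/a" and "/data/a/" collide.
        normalized = str(path)
        if normalized in seen:
            raise ValueError(f"duplicate mount_path: {spec.mount_path!r}")
        seen.add(normalized)
    return specs


checked = validate_mounts(
    [
        MountSpec("s3://ryft-public-sample-data/esRedditJson/", "/content/esRedditJson/"),
        # Same bucket mounted at a second, unique path purely for illustration.
        MountSpec("s3://ryft-public-sample-data/esRedditJson/", "/content/esRedditJsonCopy/"),
    ]
)
```

In an app, each checked spec would then become Mount(source=spec.source, mount_path=spec.mount_path), and the resulting list would be passed to CloudCompute(mounts=[...]).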

Read Files From a Mount

Once a Mount object is passed to CloudCompute, you can access, list, or read any file from the mount under the specified mount_path, just as you would if the files were on your local machine.

Assuming your mount_path is "/content/esRedditJson/", you can do the following:

Read Files

with open("/content/esRedditJson/esRedditJson1", "r") as f:
    some_data = f.read()

# do something with "some_data"...

List Files

import os
files = os.listdir("/content/esRedditJson/")
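
Because a mounted file can be as large as 5 GB, it may be preferable to stream files line by line instead of calling f.read(). The sketch below combines listing and reading. iter_lines is a hypothetical helper, and it assumes the files are UTF-8 text with one record per line, which this guide does not state about the sample bucket.

```python
import os


def iter_lines(mount_path):
    """Yield (file_name, line) pairs for each regular file directly under mount_path."""
    for name in sorted(os.listdir(mount_path)):
        full_path = os.path.join(mount_path, name)
        if not os.path.isfile(full_path):
            continue  # skip subdirectories
        # errors="replace" keeps the loop going if a file is not valid UTF-8.
        with open(full_path, "r", encoding="utf-8", errors="replace") as f:
            for line in f:  # reads one line at a time, not the whole file
                yield name, line.rstrip("\n")
```

Inside a work's run method this could be used as: for name, line in iter_lines("/content/esRedditJson/"): ...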

See the Full Example

import os

import lightning as L
from lightning.app import CloudCompute
from lightning.app.storage import Mount

class ReadMount(L.LightningWork):
    def run(self):
        # Print a list of files stored in the mounted S3 bucket.
        files = os.listdir("/content/esRedditJson/")
        for file in files:
            print(file)

        # Read the contents of a particular file in the bucket, "esRedditJson1".
        with open("/content/esRedditJson/esRedditJson1", "r") as f:
            some_data = f.read()
            # do something with "some_data"...


class Flow(L.LightningFlow):
    def __init__(self):
        super().__init__()
        self.my_work = ReadMount(
            cloud_compute=CloudCompute(
                mounts=Mount(
                    source="s3://ryft-public-sample-data/esRedditJson/",
                    mount_path="/content/esRedditJson/",
                ),
            )
        )

    def run(self):
        self.my_work.run()


# Wrap the root flow in a LightningApp so it can be launched.
app = L.LightningApp(Flow())

Note

When running a Lightning App on your local machine, any CloudCompute configuration (including a Mount) is ignored at runtime. If you need access to these files on your local disk, you should download a copy of them to your machine.
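
One way to keep the same work code usable both in the cloud and locally is to pick the data directory at runtime. The sketch below is an illustration only: resolve_data_dir is a hypothetical helper, and the local fallback directory is assumed to be wherever you downloaded a copy of the files.

```python
import os


def resolve_data_dir(mount_path, local_fallback):
    """Prefer the mounted directory; otherwise use a local copy of the data."""
    if os.path.isdir(mount_path):
        return mount_path  # running in the cloud, the mount is present
    if os.path.isdir(local_fallback):
        return local_fallback  # running locally, the mount was ignored
    raise FileNotFoundError(
        f"neither {mount_path!r} nor {local_fallback!r} is an existing directory"
    )
```

A work could then call resolve_data_dir("/content/esRedditJson/", "./esRedditJson") at the top of run, where "./esRedditJson" is a placeholder for your own download location.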

Note

Mounted files from an S3 bucket are read-only. Any modifications, additions, or deletions to files in the mounted directory will not be reflected in the cloud object store.
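
If a work needs to change a file, a simple pattern is to copy it out of the mount into a writable directory first. This is a minimal sketch under that assumption. copy_to_scratch and the scratch directory are hypothetical, and the edited copy stays on the work's disk: nothing here uploads it back to the bucket.

```python
import os
import shutil


def copy_to_scratch(mounted_file, scratch_dir):
    """Copy one file from the mount into scratch_dir and return the new path."""
    os.makedirs(scratch_dir, exist_ok=True)
    target = os.path.join(scratch_dir, os.path.basename(mounted_file))
    shutil.copyfile(mounted_file, target)  # the file under the mount is only read
    return target
```

For example, copy_to_scratch("/content/esRedditJson/esRedditJson1", "/tmp/scratch") returns a path that can be opened for writing, with "/tmp/scratch" standing in for any writable location.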


Mounting Private AWS S3 Buckets - Coming Soon!

We’ll let you know when this feature is ready!


Mounting Google Cloud GCS Buckets - Coming Soon!

We’ll let you know when this feature is ready!