Level: Beginner
Prerequisite: Knowledge of Lightning AI, framework basics
For researchers and machine learning engineers alike, implementing databases for ML systems is a chore.
There are a number of reasons to hate dealing with datasets:
- Bias
- Issues with labeling
- Time-consuming
- Trust factor with public data
- Small sample sizes
- Stale data
- Data not relevant to the task
However, working with data and adding databases to your ML systems is an important part that cannot be overlooked as you train and scale your machine learning models. So, let’s push past any bad feelings about data and learn how to add a database to a machine learning application.
Databases are a fundamental piece of software services. Users rely on them to guarantee the security and integrity of their data, as well as other pertinent details such as assets or transaction records.
While building Lightning AI, we factored the need for databases heavily into our decision-making process. As the Lightning ecosystem continues to evolve, we have enabled basic functionality to set up “transient databases” with the app-building tools themselves.
In this blog post, we’ll take a look at these tools and how you can use them to create database Components. We’ll also explore how to use these components to build a Redis database in a machine learning application called a Lightning App that talks to that database. We are using Redis as an example here, but the functionalities presented in this post will enable you to set up any database Component by yourself.
If you’re searching for a database Component, make sure to first check the Component Gallery before re-inventing the wheel. You may find that a Component for your favorite database has already been created!
Let’s begin by making sure we understand what a “transient database” is.
To let users set up databases without going through a hosted service, we’ll set up a database using a LightningWork. These databases run the service on the node where the LightningWork is running (on AWS, these nodes are EC2 instances).
Setting up a transient database is particularly useful if you don’t require the bells and whistles of a hosted database or are just testing out a simple application. The tradeoff here is reduced data persistence and availability in your database, which can be mitigated through the use of Drive and automatic restart.
Some important points to note are that:
- Transient databases cannot reliably back up data; you can attempt this with Drive, but those backups aren’t guaranteed
- Transient databases cannot scale horizontally
- Transient databases don’t guarantee high availability or data recovery
Despite these limitations, transient databases are incredibly useful in a number of common scenarios:
- You need none of the features mentioned above, for example, caching with Redis
- You are testing out an application and will be moving to a proper hosted solution once you’ve completed development
- Your existing application requires a database, but the above features are unnecessary
In the following tutorial, we’ll set up a Redis Component using LightningWork and show how we can use it in a Lightning App. Redis has been chosen arbitrarily and the fundamentals we explore in this tutorial will allow you to set up a Component for any of your preferred databases.
Redis Component
Lightning AI abstracts away basic infrastructure pieces like compute and network management from developers. You keep these abstractions whenever you inherit from the LightningWork or LightningFlow base class and run your component using LightningApp. If you need a refresher on LightningFlow or LightningWork, check out our quick start.
Below is a trimmed-down version of a practical Redis Component. Let’s dive into each of the key pieces that make this Component work.
import os
import time

import redis
from lightning.app import BuildConfig, LightningWork

# RUNNING_AT_CLOUD, rand_password_gen, and the _init_redis,
# _has_docker_installed, and _is_redis_running helpers are omitted
# from this trimmed-down version.


class RedisComponent(LightningWork):
    def __init__(self):
        # Use a prebuilt base image that already contains the Redis server.
        build_config = BuildConfig(image="ghcr.io/gridai/lightning-redis:v0.1")
        super().__init__(parallel=True, cloud_build_config=build_config)
        # Filled in by run(), on the machine where the work actually runs.
        self.redis_host = None
        self.redis_port = None
        self.redis_password = None
        # Declared here so the flow can read it before run() has started.
        self.running = False

    def run(self):
        self.redis_host = self.internal_ip
        self.redis_port = self.port
        self.redis_password = os.getenv("REDIS_PASSWORD", rand_password_gen())

        if not RUNNING_AT_CLOUD:
            if self._has_docker_installed():
                self._init_redis(docker=True)
            else:
                raise RuntimeError("Cannot run redis locally")
        else:
            self._init_redis(docker=False)

        # Keep checking the server; raising exits the work.
        while True:
            self.running = self._is_redis_running(password=self.redis_password)
            if not self.running:
                raise RuntimeError("Redis is not running")
            time.sleep(1)

Let’s also make a dummy Lightning App to demonstrate how to actually use this Component. Below is the root LightningFlow that initiates a single LightningWork, which is our Redis Component. In the run method, we run the Redis Component and check its status.
from lightning.app import LightningFlow


class LitApp(LightningFlow):
    def __init__(self) -> None:
        super().__init__()
        self.lightning_redis = RedisComponent()

    def run(self):
        self.lightning_redis.run()
        if not self.lightning_redis.running:
            print("redis is down")
        else:
            print(
                "is redis up?, ",
                redis.Redis(
                    host=self.lightning_redis.redis_host,
                    port=self.lightning_redis.redis_port,
                    password=self.lightning_redis.redis_password,
                ).ping(),
            )

The run method
This method houses the most interesting piece of code. It’s key for both LightningFlow and LightningWork, which you can explore further in our documentation.
The run method of our LightningFlow component calls the run method of our Redis component. Right after that call, we check whether Redis is running or not. Once we call lightning_redis.run(), the Redis component needs some initialization time to spin up a new machine/container and start the Redis service there.
A service interruption is also possible later on, and when that happens, self.lightning_redis.running becomes False. The LightningFlow component needs to act on that state change. For brevity, we are only printing "redis is down" in this blog post, but we can do a lot more there, like sending a notification via email or restarting the service altogether.
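One way to act on that state change without repeating the same action on every check is to react only when the value flips. The following is a minimal plain-Python sketch of that idea; StatusWatcher and its callback are hypothetical names, not part of the Lightning API, and in a real LightningFlow you would probably keep the last seen value as a simple attribute on the flow instead.

```python
from typing import Callable, Optional


class StatusWatcher:
    """Fire a callback only when a boolean status changes value.

    Hypothetical helper for illustration; not part of the Lightning API.
    """

    def __init__(self, on_change: Callable[[bool], None]) -> None:
        self._on_change = on_change
        self._last: Optional[bool] = None  # no status observed yet

    def update(self, running: bool) -> bool:
        """Record the latest status and return True if it changed."""
        # The first observation only sets the baseline.
        changed = self._last is not None and running != self._last
        self._last = running
        if changed:
            self._on_change(running)
        return changed


# Usage: collect events in a list instead of sending a real notification.
events = []
watcher = StatusWatcher(lambda up: events.append("up" if up else "down"))
for status in [False, False, True, True, False]:
    watcher.update(status)
```

Here the callback is where an email notification or a restart could be triggered; the sketch deliberately leaves that part out.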
Now, let’s look at the run method of our Redis component.
Our goal with the run method of this Redis component is to bring up the Redis server, continuously check its status, and update other components if something goes wrong, for example if the server dies. First, we set up a Redis server based on where it is running. If it’s being run locally, we first need to check whether the user has Docker installed, since we’re using the pre-built base image. Then, we start an endless while loop that continually checks the status of the Redis server and exits if the server has stopped.
if not RUNNING_AT_CLOUD:
    if self._has_docker_installed():
        self._init_redis(docker=True)
    else:
        raise RuntimeError("Cannot run redis locally")
else:
    self._init_redis(docker=False)

while True:
    self.running = self._is_redis_running(password=self.redis_password)
    if not self.running:
        raise RuntimeError("Redis is not running")
    time.sleep(1)
If the server is not running, we do two things:
- Set the attribute running to False so that other components checking this attribute see the change
- Raise an exception, which ultimately exits the work
That’s it! That’s the core of running a Redis (or any database) server and using it in a Lightning App.
In the following sections, we’ll explore the different pieces that make these components run.
Build Config
In the above example, we used a BuildConfig object and passed a base image with the utilities necessary for running a LightningWork. Build Config allows you to “build” this base. This includes, for instance, installing several system packages or changing the permissions of certain files. In this example, we are using a base image prebuilt with the tools we need, like a Redis server and the Redis module, so we don’t have to install them each time we bring the Redis Component up.
Check out the BuildConfig documentation to learn more about this.
BuildConfig(image="ghcr.io/gridai/lightning-redis:v0.1")
We pass the build_config object into the __init__ of the parent class LightningWork, which lets the LightningWork know to use our BuildConfig instead of the default one.
Parallel Work
You may have noticed the parallel=True argument that we passed to the __init__ of the LightningWork superclass.
This sets our LightningWork to run in parallel with everything else. Without this, the run method of our Redis component would be a blocking call in the main process. For instance, the dummy app we depicted above has a run method that calls the run method of the Redis component:
def run(self):
    self.lightning_redis.run()
    print("this line would never be reached without parallel=True")
If parallel were set to False, the print statement would never be reached.
Can you guess why? Our Redis component’s run method has a while loop at the end that will not finish without an exception.
Host address and port
Every LightningWork exposes one port and an internal IP address that we can use to expose our service to another LightningWork. That’s what these lines from the run method of our Redis component do:
def run(self):
    self.redis_host = self.internal_ip
    self.redis_port = self.port
The variables redis_host and redis_port are accessible in the parent LightningFlow, which can pass these values down to other components. For instance, in the above dummy Lightning App example, we use this information to ping the Redis server using the Redis Python client:
redis.Redis(
    host=self.lightning_redis.redis_host,
    port=self.lightning_redis.redis_port,
).ping()
Note that these variables are initialized in the __init__ method with None and are filled with values in the run method.
This is because the __init__ method might run on multiple machines (wherever we initialize that object), but the run method runs on only one specific machine. We need the IP address and port of the machine where the run method is running, because that’s what we use to expose the Redis server.
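Because the values start out as None, any code that consumes them should be prepared for the case where the work has not started yet. The sketch below shows one way to guard against that; redis_kwargs is a hypothetical helper, SimpleNamespace stands in for the Redis component, and the address is a made-up example value.

```python
from types import SimpleNamespace
from typing import Optional


def redis_kwargs(work) -> Optional[dict]:
    """Return connection kwargs, or None while the work has not started."""
    if work.redis_host is None or work.redis_port is None:
        return None
    return {"host": work.redis_host, "port": work.redis_port}


# SimpleNamespace stands in for the Redis component in this sketch.
work = SimpleNamespace(redis_host=None, redis_port=None)
before = redis_kwargs(work)  # None: run() has not filled in the values yet

# Simulate what run() does once it is executing on its machine.
work.redis_host, work.redis_port = "10.0.0.5", 6379
after = redis_kwargs(work)
```

A caller can then skip connecting while the helper returns None and pass the dictionary to the Redis client once it is populated.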
Be aware that the internal_ip attribute of LightningWork gives an IP address that’s internal to the network your app is running in (when you are running in the cloud) and isn’t accessible from the public network. If you need to access one of your services from a public network, such as from a browser, use the host or url attributes of LightningWork.
However, because they only support the http and https protocols, you cannot use them to expose something like a Redis server, which communicates using a different protocol.
Custom Credential
When setting up a database, many users want to use secure password protection.
Passing secrets to your component is not currently supported by Lightning AI, but you can still make use of the env variables to pass your own password. Here is how you do that.
If you’re passing an env variable when creating a Lightning App as shown below, we’ll make this env variable available in all the machines that run under the Lightning App. That includes all the LightningWork machines and the LightningFlow machine.
lightning run app app.py --env VAR_NAME=VAR_VALUE
Using this feature, we can enable a custom password for the Redis server. Take a close look at this line in the run method of the Redis component:
self.redis_password = os.getenv("REDIS_PASSWORD", rand_password_gen())
We read the user-supplied password from the env variable, and if one is not given, we generate a random password. Anyone who uses this component can pass a custom password while running the Lightning App, as shown here:
lightning run app app.py --env REDIS_PASSWORD=<your-password>
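The rand_password_gen helper is not shown in this post. As an illustration only, here is one way the env-variable lookup with a random fallback could be written using the standard library; both function bodies are assumptions, not the component’s actual code. Unlike passing the generator as the default to os.getenv, this version only generates a password when none is supplied, and it treats an empty REDIS_PASSWORD as unset.

```python
import os
import secrets


def rand_password_gen(num_bytes: int = 24) -> str:
    """Generate a URL-safe random password (hypothetical implementation)."""
    return secrets.token_urlsafe(num_bytes)


def resolve_redis_password() -> str:
    """Prefer REDIS_PASSWORD from the environment; otherwise generate one."""
    return os.environ.get("REDIS_PASSWORD") or rand_password_gen()


# Usage: with the variable set, the user-supplied value wins.
os.environ["REDIS_PASSWORD"] = "example-password"
from_env = resolve_redis_password()

# Without it, a fresh random password is generated.
del os.environ["REDIS_PASSWORD"]
generated = resolve_redis_password()
```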
Regardless of which password the Redis server uses for its auth setup, the password is available in the redis_password attribute of the work component. In fact, we used this in the dummy Lightning App example above to connect to Redis:
redis.Redis(
    host=self.lightning_redis.redis_host,
    port=self.lightning_redis.redis_port,
    password=self.lightning_redis.redis_password,
).ping()
Installing Redis Component
We have built a tested version of this component and it is being reviewed by the component review team for placement in the Components Gallery. Until it’s published in the gallery, you can install it using:
lightning install component git+https://github.com/Lightning-AI/LAI-Redis-Component.git@main
Check out the documentation for more details.
Wrap up!
Lightning AI is designed to become the OS for AI. It exposes the basic building blocks on which we can build components and AI apps (or, as we call them, Lightning Apps).
What we built here runs a Redis server as a subprocess in a LightningWork component. For a lot of use cases this is more than enough, while other applications need more reliable and stable database instances, such as the ones we get from hosted services like RDS from AWS. While that wasn’t the goal of this blog post, keep an eye on our Components Gallery for components that are built for just that.
Or even better, we welcome you to build one for the community.
By Sherin Thomas, Senior Software Engineer Lightning AI