Object Detection on the Cloud with YOLOv8

From non-neural approaches like Viola–Jones to deep neural network architectures like the R-CNN family and the YOLO model series, object detection has come a long way in recent years. YOLO (You Only Look Once), a novel and efficient approach to object detection, was first released in 2015. It gained popularity because, unlike earlier architectures, YOLO performs detection with a single network that predicts bounding boxes and class probabilities in one forward pass. This allowed end-to-end optimization and inspired new model architectures for edge devices. In this blog post, we look at the latest YOLOv8 model released by Ultralytics, the successor to their widely used YOLOv5 model. We will explore this model’s new features and cover how you can use Lightning to deploy it to perform object detection on the cloud.

Deploy on the cloud

Lightning enables you to quickly and easily deploy models like YOLOv8 on the cloud. Learn more about building a training and deployment pipeline and how to scale your model serving with Lightning.

YOLOv5 became widely adopted as a result of its simplicity and performance, and because it requires minimal hyperparameter tuning to train a model. The latest model (YOLOv8) maintains all the excellent features of the previous version and introduces an improved developer experience for the training, finetuning, and deployment of models.

The team behind YOLOv8 is moving quickly to add new features and will release the paper very soon. Meanwhile, the model is open-source and you can start using it right away.


What is Ultralytics?

Ultralytics is a new computer vision framework by the creators of YOLOv5. It is the result of their learnings from YOLOv5 and continuous research and development efforts. The name of the installable YOLOv8 package is ultralytics, and to install the package you can enter pip install ultralytics in your terminal.

Major Ultralytics features

  • Ease of installation: It comes as a PyPI package, so you can install the code along with all its dependencies.
  • Improved command line interface (CLI): The new CLI provides the functionalities to do training, validation, prediction, and model serialization to optimized formats like ONNX and TensorRT.
  • Python interface: It consists of a Pythonic API that can be used to train models from any Python environment and even in an interactive Jupyter Notebook.
  • Training support for multiple YOLO versions: With the latest Ultralytics package you can train not only YOLOv8; it will also support training YOLOv5 and upcoming versions of YOLO.
  • Anchorless: Unlike YOLOv5, YOLOv8 is anchorless; it predicts bounding boxes directly instead of as offsets from predefined anchor boxes.

More information and docs are available on GitHub.
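
To make the anchorless point from the feature list more concrete, here is a minimal sketch in plain Python. It assumes a simplified anchor-free head that, for each grid cell, predicts the distances from the cell center to the four sides of the box. The function name, the stride parameter, and the example numbers are illustrative assumptions and are not taken from the Ultralytics code base.

```python
def decode_anchor_free(cell_x, cell_y, distances, stride):
    """Turn one anchor-free prediction into an (x1, y1, x2, y2) box in pixels.

    cell_x, cell_y: integer grid coordinates of the predicting cell.
    distances: (left, top, right, bottom) distances from the cell center,
               expressed in grid units.
    stride: number of input pixels covered by one grid cell.
    """
    left, top, right, bottom = distances
    # The reference point is the cell center, not a predefined anchor box.
    center_x = cell_x + 0.5
    center_y = cell_y + 0.5
    x1 = (center_x - left) * stride
    y1 = (center_y - top) * stride
    x2 = (center_x + right) * stride
    y2 = (center_y + bottom) * stride
    return (x1, y1, x2, y2)


# Cell (4, 2) on a stride-8 feature map predicts a box reaching
# 1.5 cells left, 1 cell up, 2.5 cells right, and 3 cells down.
box = decode_anchor_free(4, 2, (1.5, 1.0, 2.5, 3.0), stride=8)
print(box)
```

An anchor-based head would instead predict scale and offset corrections relative to a set of preset box shapes, which have to be chosen to suit the dataset.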

Best Practices for using the Ultralytics module

  • Save the model so that your training weights are always stored along with the model configuration.
  • Use the default augmentation and hyperparameters if you’re not sure about them.
  • Use streaming mode if you’re running inference on long videos or streams.
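
The streaming recommendation is about memory: handling frames lazily avoids holding every result at once. In the Ultralytics Python API this is enabled by passing stream=True to predict, which then returns a generator. The sketch below shows the same pattern in plain Python; stream_detections and fake_detect are hypothetical stand-ins written for this illustration, not library functions.

```python
from typing import Callable, Iterable, Iterator, List


def stream_detections(
    frames: Iterable[str],
    detect: Callable[[str], List[str]],
) -> Iterator[List[str]]:
    """Yield detections one frame at a time instead of building a full list."""
    for frame in frames:
        # Only the current frame's result is produced at this point.
        yield detect(frame)


def fake_detect(frame: str) -> List[str]:
    """Hypothetical detector: pretends every frame contains one person."""
    return ["person"]


# A generator expression stands in for a long video; frames are produced lazily.
frames = (f"frame-{i}" for i in range(1000))

total = 0
for labels in stream_detections(frames, fake_detect):
    total += len(labels)

print(total)
```

Because each result is consumed and discarded inside the loop, memory use stays roughly constant no matter how long the video is.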



The following is the benchmark for YOLOv8 against the previous versions on the COCO dataset.

Fig 1. YOLOv8 benchmark against previous versions


How to deploy your YOLOv8 model

So, you’ve trained a custom object detection model. What next?

Let’s deploy this model in such a way that it scales out based on traffic without human interference.

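For this tutorial, we’ll export the model to TorchScript format. TorchScript is a serializable and optimizable format for PyTorch code. A benefit of this format is that a TorchScript model can run independently of the Python interpreter, so inference is not constrained by Python’s global interpreter lock (GIL).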


Using the CLI:

yolo export format=torchscript


With a Python API:

model.export(format="torchscript")


After exporting, we build our model server. We’ll be using the PythonServer component (a reusable and customizable unit for building distributed Python applications) to deploy the model.

To build a model server using PythonServer, we need to create a class inheriting PythonServer and implement two abstract methods: setup(...) and predict(...).

We create our model inside the setup method and write the prediction logic inside the predict method:

# !pip install ultralytics lightning-api-access
# !pip uninstall -y opencv-python opencv-python-headless
# !pip install opencv-python-headless==

import lightning as L
# The module path is missing in the source; lightning.app.components is assumed.
from lightning.app.components import PythonServer
from pydantic import BaseModel

class InputType(BaseModel):
    image_url: str

class Detections(BaseModel):
    prediction: list

class YoloV8Server(PythonServer):
    def setup(self):
        # Import here so the package is only needed where the server runs.
        from ultralytics import YOLO
        self._model = YOLO("")  # set this to the path of your model weights

    def predict(self, request: InputType):
        # predict returns one result per image; we sent a single image.
        preds = self._model.predict(request.image_url)[0]
        classes = preds.boxes.cls
        # Map each predicted class index to its human-readable name.
        results = [self._model.names[int(cls)] for cls in classes]
        return {"prediction": results}

component = YoloV8Server(input_type=InputType, output_type=Detections)
app = L.LightningApp(component)


To run the model server, open a terminal and enter lightning run app followed by the name of the file that contains the code above. A browser tab will open with the API documentation.
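
Once the server is running, a client can send it an image URL. The sketch below only builds the HTTP request with the Python standard library and does not send it. It assumes that the server exposes a POST /predict route and that the JSON body mirrors the InputType model above; the base URL and the image URL are placeholders.

```python
import json
from urllib import request


def build_predict_request(base_url: str, image_url: str) -> request.Request:
    """Build (but do not send) a JSON POST request for the model server."""
    body = json.dumps({"image_url": image_url}).encode("utf-8")
    return request.Request(
        url=base_url.rstrip("/") + "/predict",  # assumed route name
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder addresses; replace them with your own server and image.
req = build_predict_request("http://127.0.0.1:7501/", "https://example.com/dog.jpg")
print(req.get_method(), req.full_url)

# To actually call a running server:
# with request.urlopen(req) as response:
#     print(json.loads(response.read()))
```

If the route or the payload shape differs in your setup, the API documentation page that opens in the browser is the place to check.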

To deploy this application on the cloud, simply append the cloud flag and your model will be deployed on the Lightning AI cloud platform:

lightning run app --cloud


You can make your server more powerful by enabling autoscaling with AutoScaler, a Lightning component that scales the model server automatically based on traffic. To use AutoScaler, import the component and wrap your YoloV8Server class in it:

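# The module path is missing in the source; lightning.app.components is assumed.
from lightning.app.components import AutoScaler

# Pass the server class itself; AutoScaler creates the server replicas.
component = AutoScaler(YoloV8Server, input_type=InputType, output_type=Detections)
app = L.LightningApp(component)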
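
To give an intuition for what scaling based on traffic means, here is a minimal sketch of a scaling rule in plain Python. It is a hypothetical policy written for illustration and is not Lightning's AutoScaler implementation; desired_replicas and all of its parameters are assumed names.

```python
import math


def desired_replicas(
    pending_requests: int,
    requests_per_replica: int,
    min_replicas: int = 1,
    max_replicas: int = 4,
) -> int:
    """Return how many server replicas a simple traffic-based rule would run."""
    if requests_per_replica <= 0:
        raise ValueError("requests_per_replica must be positive")
    # Enough replicas to share the pending work evenly...
    needed = math.ceil(pending_requests / requests_per_replica)
    # ...but never outside the configured bounds.
    return max(min_replicas, min(max_replicas, needed))


for load in (0, 10, 25, 100):
    print(load, desired_replicas(load, requests_per_replica=10))
```

The lower bound keeps at least one server warm when traffic is idle, and the upper bound caps cost during spikes.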


And that’s it! Now we’ve built a custom model server, deployed it on the cloud, and even introduced automated scaling based on traffic in just a few lines of code.


Deploy on the cloud

Lightning gives you thirty free credits every month that you can use to deploy models like YOLOv8 on the cloud.
