Compiling & Quantizing Ultralytics YOLOv5 For Better Performance With Deci

Optimize and deploy Ultralytics YOLOv5 models with Deci's platform, enhancing performance by up to 10x. Get started for free and leverage automatic model optimization.

Written by

Ultralytics Team

Oct 26, 2022

Apr 13, 2025

At Ultralytics we commercially partner with other startups to help us fund the research and development of our awesome open-source tools, like YOLOv5, to keep them free for everybody. This article may contain affiliate links to those partners.

The Deci platform includes free tools for easily managing, optimizing, and deploying your YOLOv5 models in any production environment. Deci supports all popular DL frameworks, such as TensorFlow, PyTorch, Keras, and ONNX. All you need is our web-based platform or our Python client to run it from your code.

Why Deci?

You can use Deci not only for exporting your model but also for pruning and quantizing it.

Deci provides a convenient interface for exporting models and for comparing the performance of the original and converted models. Users can also choose to further optimize their models through quantization.

With Deci You Can:

Improve Inference Performance By Up To 10x

Automatically compile and quantize your models and evaluate different production settings to achieve better latency and throughput, and to reduce model size and memory footprint on your hardware.

Find The Best Inference Hardware For Your Application

Benchmark your model's performance on various hardware devices (including edge devices) with the click of a button. Eliminate the need to manually set up and test multiple hardware and production settings.

Deploy With A Few Lines Of Code

Leverage Deci's Python-based inference engine, which is compatible with multiple frameworks and hardware types.

For more information about the Deci Platform please visit Deci's website.

First-Time Setup

Step 1

Open your free account.

Get Started With Deci and Ultralytics YOLOv5

Step 2

To start optimizing your pre-trained YOLOv5 model, you will need to convert it to ONNX format. See YOLOv5 Export Tutorial for instructions on how to convert your model to ONNX format.
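
As a rough sketch of this step, the export.py script in the YOLOv5 repository can be invoked with --include onnx. The helper below only assembles and prints that command. The function names build_export_command and export_to_onnx and the default weights file yolov5s.pt are illustrative assumptions, and the Export Tutorial remains the authoritative reference for the available flags.

```python
import shlex
import subprocess
import sys


def build_export_command(weights="yolov5s.pt", imgsz=640, batch_size=1):
    """Assemble the argv list for exporting a YOLOv5 checkpoint to ONNX.

    Hypothetical helper: it assumes it is run from the root of a cloned
    ultralytics/yolov5 repository, where export.py lives.
    """
    return [
        sys.executable, "export.py",
        "--weights", str(weights),
        "--include", "onnx",
        "--imgsz", str(imgsz),
        "--batch-size", str(batch_size),
    ]


def export_to_onnx(weights="yolov5s.pt", imgsz=640, batch_size=1):
    """Run the export; raises CalledProcessError if export.py fails."""
    subprocess.run(build_export_command(weights, imgsz, batch_size), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch is safe to try.
    print(shlex.join(build_export_command()))
```

A static ONNX export generally fixes the input shape, so it is worth keeping the image size and batch size here consistent with what you plan to select when uploading the model.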

Step 3

Go to the "Lab" tab and click the "New Model" button in the top right part of the screen to upload your YOLOv5 ONNX model.

Convert Ultralytics YOLOv5 models to ONNX for future deployment with Deci

Follow the steps of the model upload wizard to select your target hardware as well as desired batch size and quantization level for the model compilation.

Ultralytics YOLOv5 model compilation for deployment with Deci
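
To get a rough sense of what the quantization level means for model size, the sketch below estimates weight storage at different precisions. It is back-of-the-envelope arithmetic, not a description of how Deci works: it assumes every parameter is stored at the chosen precision and ignores file-format overhead, and the 7.2 million parameter count is the approximate figure published for YOLOv5s.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"FP32": 4, "FP16": 2, "INT8": 1}


def estimated_weight_size_mb(num_params, precision):
    """Estimate weight storage in megabytes (1 MB = 10**6 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e6


if __name__ == "__main__":
    yolov5s_params = 7_200_000  # approximate count, assumed for illustration
    for precision in BYTES_PER_PARAM:
        size = estimated_weight_size_mb(yolov5s_params, precision)
        print(f"{precision}: ~{size:.1f} MB")
```

Under these assumptions the weights come to roughly 28.8 MB at FP32, 14.4 MB at FP16, and 7.2 MB at INT8. Actual file sizes and speedups depend on the target hardware and runtime, which is what the platform's benchmark measures.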

After filling in the relevant information, click "Start". The Deci platform will automatically perform a runtime optimization of your YOLOv5 model for the hardware you selected as well as benchmark your model on various hardware types. This process takes approximately 10 minutes.

Once done, a new row will appear on your screen underneath the baseline model you previously uploaded. Here you can see the optimized version of your pre-trained YOLOv5 model.

Ultralytics YOLOv5 optimized model for deployment with Deci

What's Next?

You can then download your optimized model by clicking the "Deploy" button.

You will be prompted to download the model and will receive instructions on how to install and use Infery, Deci's runtime inference engine.

The use of Infery is optional. You can get the raw Python files and use them with any other inference engine of your choice.

Use Deci Infery to deploy Ultralytics YOLOv5

Explore the optimization and benchmark results on the "Insights" tab.

Optimization with Deci of Ultralytics YOLOv5 model
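
If you want to sanity-check the reported numbers on your own machine, a small timing harness is one option. The sketch below is generic Python and not part of Deci or Infery: benchmark and speedup are hypothetical helpers, and predict stands for whatever callable runs one inference in the engine you chose.

```python
import statistics
import time


def benchmark(predict, batch_size=1, warmup=5, runs=50):
    """Time a zero-argument inference callable.

    Returns mean latency in milliseconds and throughput in samples/second.
    """
    for _ in range(warmup):  # warm-up iterations are not timed
        predict()
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        predict()
        timings.append(time.perf_counter() - start)
    mean_s = statistics.mean(timings)
    return {
        "latency_ms": mean_s * 1000.0,
        "throughput": batch_size / mean_s,
    }


def speedup(baseline_latency_ms, optimized_latency_ms):
    """How many times faster the optimized model is than the baseline."""
    return baseline_latency_ms / optimized_latency_ms


if __name__ == "__main__":
    # Stand-in workload; replace with a call into your inference engine.
    result = benchmark(lambda: sum(range(10_000)))
    print(result)
    print(speedup(20.0, 10.0))  # a halved latency is a 2x speedup
```

Running the same harness against the baseline and the optimized model, with identical inputs and batch size, gives two latency figures whose ratio is the speedup.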

Ready To Get Started?

Before wrapping up, let’s discuss some of the advantages Deci offers:

Optimizes your model’s inference throughput and latency without compromising accuracy
Optimizes models from all the popular frameworks
Supports models targeted at any deep-learning task
Supports deployment on popular CPU and GPU machines
Benchmarks the fitness of your model on different hardware hosts and cloud providers
Gets uploaded models ready for serving, inference, and deployment

As you have just seen, you can double the performance of a YOLOv5 model in about 15 minutes overall. The Deci platform is super easy and intuitive to use.

Any questions? Join our community and leave your question today!
