Comparing Ultralytics YOLO11 vs previous YOLO models

From automating everyday tasks to helping make informed decisions in real time, artificial intelligence (AI) is reshaping the future of various industries. One particularly fascinating area of AI is computer vision, otherwise known as Vision AI. It focuses on enabling machines to analyze and interpret visual data like humans do.

Specifically, computer vision models are driving innovations that enhance both safety and efficiency. For example, these models are used in self-driving cars to detect pedestrians and in security cameras to monitor premises around the clock.

Some of the most well-known computer vision models are the YOLO (You Only Look Once) models, known for their real-time object detection capabilities. Over time, YOLO models have improved, with each new version offering better performance and more flexibility.

Newer versions like Ultralytics YOLO11 can handle a variety of tasks, like instance segmentation, image classification, pose estimation, and multi-object tracking, with better accuracy and speed than earlier versions.

In this article, we’ll compare Ultralytics YOLOv8, YOLOv9, YOLOv10, and Ultralytics YOLO11 to get a better idea of how these models have evolved. We’ll analyze their key features, benchmark results, and performance differences. Let’s get started!

An overview of Ultralytics YOLOv8

YOLOv8, released by Ultralytics on January 10, 2023, was a major step forward compared to earlier YOLO models. It’s optimized for real-time, accurate detection, combining well-tested approaches with innovative updates for better results.

Going beyond object detection, it also supports the following computer vision tasks: instance segmentation, pose estimation, oriented bounding box (OBB) detection, and image classification. Another important feature of YOLOv8 is that it is available in five model variants - Nano, Small, Medium, Large, and Extra Large (X) - so you can choose the right balance of speed and accuracy based on your needs.
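
As a rough illustration of how these variants and tasks map onto pretrained checkpoints, the sketch below builds weight filenames that follow the naming pattern used by the Ultralytics package (for example, yolov8n.pt or yolov8m-seg.pt). The helper name yolov8_weights is hypothetical, and the pattern is worth checking against the current Ultralytics documentation before relying on it.

```python
# Size letters: n = Nano, s = Small, m = Medium, l = Large, x = X.
SIZES = ("n", "s", "m", "l", "x")

# Suffix appended to the checkpoint name for each task (detection has none).
TASK_SUFFIXES = {
    "detect": "",
    "segment": "-seg",
    "pose": "-pose",
    "obb": "-obb",
    "classify": "-cls",
}


def yolov8_weights(size: str = "n", task: str = "detect") -> str:
    """Hypothetical helper: build a YOLOv8 checkpoint name from a size and a task."""
    if size not in SIZES:
        raise ValueError(f"unknown size {size!r}, expected one of {SIZES}")
    if task not in TASK_SUFFIXES:
        raise ValueError(f"unknown task {task!r}")
    return f"yolov8{size}{TASK_SUFFIXES[task]}.pt"
```

The resulting name can then be passed to the package's YOLO class, for example YOLO(yolov8_weights("s", "pose")).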

Due to its versatility and strong performance, YOLOv8 can be used in many real-world applications, like security systems, smart cities, healthcare, and industrial automation.

Fig 1. Parking management in smart cities with YOLOv8.

Key features of YOLOv8

Here is a closer look at some of the other key features of YOLOv8:

Enhanced detection architecture: YOLOv8 uses an improved CSPDarknet backbone. This backbone is optimized for feature extraction - the process of identifying and capturing important patterns or details from input images that help the model make accurate predictions.
Detection head: It uses an anchor-free, decoupled design, meaning it doesn’t rely on preset bounding box shapes (anchors) and instead learns to predict object locations directly. Due to the decoupled setup, the tasks of classifying what the object is and predicting where it is (regression) are handled separately, which helps improve accuracy and speeds up training.
Balances accuracy and speed: This model achieves impressive accuracy while maintaining fast inference times, making it suitable for both cloud and edge environments.
User-friendly: YOLOv8 is designed to be easy to get started with - you can begin predicting and seeing results in just a few minutes using the Ultralytics Python package.
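
To make the anchor-free idea from the detection head item more concrete, here is a minimal sketch of how a box can be predicted directly from a grid point: the head outputs four distances (left, top, right, bottom) and the box corners follow from them. This is a simplified illustration under that assumption, not the Ultralytics implementation, and the function name is hypothetical.

```python
from typing import Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def decode_anchor_free(point: Point, distances: Box) -> Box:
    """Convert (left, top, right, bottom) distances from a grid point into box corners.

    No preset anchor shapes are involved: the box is defined entirely by the
    grid point and the four predicted distances.
    """
    cx, cy = point
    left, top, right, bottom = distances
    return (cx - left, cy - top, cx + right, cy + bottom)


# A grid point at (100, 50) with predicted distances of 10, 5, 20 and 15 pixels.
box = decode_anchor_free((100.0, 50.0), (10.0, 5.0, 20.0, 15.0))
```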

YOLOv9 focuses on computational efficiency

YOLOv9 was released on February 21, 2024, by Chien-Yao Wang and Hong-Yuan Mark Liao from the Institute of Information Science, Academia Sinica, Taiwan. It supports tasks like object detection and instance segmentation.

This model builds on the Ultralytics YOLOv5 codebase and introduces two major innovations: Programmable Gradient Information (PGI) and Generalized Efficient Layer Aggregation Network (GELAN).

PGI helps YOLOv9 retain important information as it processes data through its layers, which leads to more accurate results. Meanwhile, GELAN improves how the model uses its layers, boosting performance and computational efficiency. Thanks to these upgrades, YOLOv9 can handle real-time tasks on edge devices and mobile apps, where computing resources are often limited.

Key features of YOLOv9

Here is a glimpse at some of the other key features of YOLOv9:

High precision with efficiency: YOLOv9 delivers high detection accuracy without consuming a lot of computing power, making it a great choice when resources are limited.
Lightweight models: YOLOv9’s lightweight model variants are optimized for edge and mobile deployments.
Easy to use: YOLOv9 is supported by the Ultralytics Python package, so it’s simple to set up and run in different environments, whether you're using code or the command line.

YOLOv10 enables NMS-free object detection

YOLOv10 was introduced on May 23, 2024, by researchers from Tsinghua University and is focused on real-time object detection. It tackles limitations in earlier YOLO versions by removing the need for non-maximum suppression (NMS), a post-processing step used to eliminate duplicate detections, and refining the overall model design. This results in faster and more efficient object detection, while still achieving state-of-the-art accuracy.
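
For context on what is being removed, the sketch below shows a plain greedy NMS pass of the kind earlier detectors typically run after inference. It is a generic textbook version written in pure Python for readability, not the code used by any specific YOLO release, and the 0.5 IoU threshold is an illustrative default.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float], iou_threshold: float = 0.5) -> List[int]:
    """Return indices of boxes to keep, highest score first.

    A box is dropped when it overlaps an already kept box by more than
    iou_threshold, which is how duplicate detections get eliminated.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep
```

Because this loop runs after the network has finished, it adds latency that depends on how many candidate boxes survive, which is the cost an NMS-free design avoids.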

A vital part of what makes this possible is a training approach known as consistent dual-label assignments. It combines two strategies: one that allows multiple predictions to learn from the same object (one-to-many) and another that focuses on choosing the best single prediction (one-to-one). Since both strategies follow the same matching rules, the model learns to avoid duplicates on its own, so NMS isn't required.
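
The two strategies can be pictured with a toy example. In the sketch below, each candidate prediction for one object has a matching score; the one-to-many branch keeps the top k candidates, while the one-to-one branch keeps only the best one. The scores, the value of k, and the function names are all hypothetical, and the real model derives its matching scores from classification confidence and box overlap rather than from a fixed list.

```python
from typing import List


def one_to_many(match_scores: List[float], k: int = 3) -> List[int]:
    """Indices of the k best-matching predictions for one object."""
    order = sorted(range(len(match_scores)), key=lambda i: match_scores[i], reverse=True)
    return order[:k]


def one_to_one(match_scores: List[float]) -> int:
    """Index of the single best-matching prediction for one object."""
    return max(range(len(match_scores)), key=lambda i: match_scores[i])


# Hypothetical matching scores of four predictions against the same object.
scores = [0.2, 0.9, 0.5, 0.7]
many = one_to_many(scores, k=3)
best = one_to_one(scores)
```

Since both branches rank candidates with the same scores, the one-to-one choice is always the top entry of the one-to-many selection, which is the consistency the paragraph above refers to.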

YOLOv10’s architecture also uses an improved CSPNet backbone to learn features more effectively and a PAN (Path Aggregation Network) neck that combines information from different layers, making it better at detecting both small and large objects. These improvements make it possible to use YOLOv10 for real-world applications in manufacturing, retail, and autonomous driving.

Key features of YOLOv10

Here are some of the other standout features of YOLOv10:

Large-kernel convolutions: The model uses large-kernel convolutions to capture more context from wider areas of the image, helping it better understand the overall scene.
Partial self-attention modules: The model incorporates partial self-attention modules to focus on the most important parts of the image without using too much computing power, efficiently boosting performance.
Unique model variant: Alongside the usual YOLOv10 sizes - Nano, Small, Medium, Large, and X - there’s also a special version called YOLOv10b (Balanced). It’s a wider model, meaning it processes more features at each layer, which helps improve accuracy while still balancing speed and size.
User-friendly: YOLOv10 is compatible with the Ultralytics Python package, making it easy to use.

Ultralytics YOLO11: Enhanced speed and accuracy

On September 30, 2024, Ultralytics officially launched YOLO11 - one of the latest models in the YOLO series - at its annual hybrid event, YOLO Vision 2024 (YV24).

This release introduced significant improvements over previous versions. YOLO11 is faster, more accurate, and highly efficient. It supports the full range of computer vision tasks that YOLOv8 users are familiar with, including object detection, instance segmentation, and image classification. It also maintains compatibility with YOLOv8 workflows, making it easy for users to transition smoothly to the new version.
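
In practice, workflow compatibility often means that only the checkpoint name changes. The sketch below assumes the usual Ultralytics naming pattern (yolov8n.pt becomes yolo11n.pt) and uses the package's YOLO class and predict method; the helper names to_yolo11 and run_prediction are hypothetical.

```python
def to_yolo11(weights: str) -> str:
    """Hypothetical helper: swap a YOLOv8 checkpoint name for its YOLO11 counterpart."""
    prefix = "yolov8"
    if not weights.startswith(prefix):
        raise ValueError(f"expected a YOLOv8 checkpoint name, got {weights!r}")
    return "yolo11" + weights[len(prefix):]


def run_prediction(weights: str, source: str):
    """Run inference with any supported checkpoint; the call is the same for both versions."""
    # Imported lazily so to_yolo11 can be used without the package installed.
    from ultralytics import YOLO

    model = YOLO(weights)
    return model.predict(source=source)
```

With this in place, run_prediction("yolov8n.pt", "image.jpg") and run_prediction(to_yolo11("yolov8n.pt"), "image.jpg") differ only in the weights that get loaded; the image path is a placeholder.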

On top of this, YOLO11 is designed to meet a wide range of computing needs - from lightweight edge devices to powerful cloud systems. The model is available as both open-source and enterprise versions, making it adaptable for different use cases.

It is a great option for precision tasks like medical imaging and satellite detection, as well as broader applications in autonomous vehicles, agriculture, and healthcare.

Key features of YOLO11

Here are some of the other unique features of YOLO11:

Fast and efficient detection: YOLO11 features a detection head designed for minimal latency, focusing on speed in the final prediction layers without compromising performance.
Improved feature extraction: An optimized backbone and neck architecture enhance feature extraction, leading to more precise predictions.
Seamless deployment across platforms: YOLO11 is optimized to run efficiently on edge devices, cloud platforms, and NVIDIA GPUs, ensuring adaptability across different environments.

Benchmarking YOLO models on the COCO dataset

When exploring different models, it’s not always easy to compare them just by looking at their features. That’s where benchmarking comes in. By running all models on the same dataset, we can objectively measure and compare their performance. Let’s take a look at how each model performs on the COCO dataset.

When comparing YOLO models, each new version brings notable improvements with respect to accuracy, speed, and flexibility. In particular, YOLO11m takes a leap here as it uses 22% fewer parameters than YOLOv8m, which makes it lighter and generally cheaper to run. Also, despite its smaller size, it achieves a higher mean average precision (mAP) on the COCO dataset. This metric measures how well the model detects and localizes objects, so a higher mAP means more accurate predictions.
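
The 22% figure is a simple relative reduction. The sketch below reproduces the arithmetic; the parameter counts (in millions) are assumptions taken from commonly published model tables and should be verified against the official Ultralytics documentation.

```python
def percent_fewer(baseline: float, new: float) -> float:
    """Relative reduction of `new` compared with `baseline`, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - new) / baseline * 100.0


# Assumed parameter counts in millions; verify against the official model tables.
YOLOV8M_PARAMS_M = 25.9
YOLO11M_PARAMS_M = 20.1

reduction = percent_fewer(YOLOV8M_PARAMS_M, YOLO11M_PARAMS_M)
print(f"YOLO11m uses about {reduction:.0f}% fewer parameters than YOLOv8m")
```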

Testing and comparing YOLO models on a video

Let’s explore how these models perform in a real-world situation.

To compare YOLOv8, YOLOv9, YOLOv10, and YOLO11, all four were run on the same traffic video using a confidence threshold of 0.3 (the model only displays detections when it is at least 30% confident that it has correctly identified an object) and an image size of 640 for fair evaluation. The object detection and tracking results highlighted key differences in detection accuracy, speed, and precision.
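
A comparison along these lines could be scripted as follows. This is a sketch under assumptions, not the exact setup used for the test: the checkpoint names are guesses because the model sizes are not stated, the video path is supplied by the caller, and the function name compare_on_video is hypothetical. It relies on the Ultralytics YOLO class, its track method, and the boxes and speed attributes of each result.

```python
# Assumed checkpoints: the sizes used in the comparison are not stated,
# so pick matching sizes where available for a fair test.
MODEL_WEIGHTS = {
    "YOLOv8": "yolov8n.pt",
    "YOLOv9": "yolov9c.pt",
    "YOLOv10": "yolov10n.pt",
    "YOLO11": "yolo11n.pt",
}

# Settings taken from the text: confidence 0.3 and image size 640.
TRACK_KWARGS = {"conf": 0.3, "imgsz": 640}


def compare_on_video(video_path: str) -> dict:
    """Run every model on the same video and summarize detections and speed."""
    # Imported lazily so the configuration above can be inspected without the package.
    from ultralytics import YOLO

    summary = {}
    for name, weights in MODEL_WEIGHTS.items():
        model = YOLO(weights)
        frames, detections, inference_ms = 0, 0, 0.0
        # stream=True yields one result per frame instead of holding them all in memory.
        for result in model.track(source=video_path, stream=True, **TRACK_KWARGS):
            frames += 1
            detections += len(result.boxes)
            inference_ms += result.speed["inference"]
        summary[name] = {
            "frames": frames,
            "detections": detections,
            "mean_inference_ms": inference_ms / frames if frames else 0.0,
        }
    return summary
```

Keeping the confidence and image size in one shared dictionary ensures every model is evaluated with identical settings.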

From the first frame, YOLO11 picked up large vehicles like trucks that YOLOv10 missed. YOLOv8 and YOLOv9 showed decent performance but varied depending on lighting conditions and object size. Smaller, distant vehicles remained a challenge across all models, although YOLO11 showed noticeable improvements in those detections as well.

In terms of speed, all models operated between 10 and 20 milliseconds per frame, which corresponds to roughly 50 to 100 FPS and is fast enough for real-time tasks. YOLOv8 and YOLOv9 provided steady and reliable detections throughout the video. Interestingly, YOLOv10, designed for lower latency, was faster but showed some inconsistencies in detecting certain object types.
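
The link between per-frame latency and frame rate is a one-line conversion, shown here for the 10 to 20 millisecond range mentioned above. The function name is illustrative.

```python
def frames_per_second(ms_per_frame: float) -> float:
    """Convert per-frame processing time in milliseconds to frames per second."""
    if ms_per_frame <= 0:
        raise ValueError("ms_per_frame must be positive")
    return 1000.0 / ms_per_frame


slowest_fps = frames_per_second(20.0)  # 20 ms per frame
fastest_fps = frames_per_second(10.0)  # 10 ms per frame
```

Note that this covers model processing time only; video decoding and drawing the results add to the total time per frame.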

YOLO11, on the other hand, stood out for its precision, offering a strong balance between speed and accuracy. Although none of the models performed perfectly in every frame, the side-by-side comparison clearly demonstrated that YOLO11 delivered the best overall performance.

Which YOLO model is the best for computer vision tasks?

Selecting a model for a project depends on its specific requirements. For example, some applications may prioritize speed, while others may require higher accuracy or face deployment constraints that influence the decision.

Another important factor is the type of computer vision tasks you need to address. If you're looking for broader flexibility across different tasks, YOLOv8 and YOLO11 are good options.

Whether you choose YOLOv8 or YOLO11 really depends on your needs. YOLOv8 is a solid option if you're new to computer vision and value a larger community, more tutorials, and extensive third-party integrations.

On the other hand, if you're looking for cutting-edge performance with better accuracy and speed, YOLO11 is the better choice, though it comes with a smaller community and fewer integrations due to being a newer release.

Key takeaways

From Ultralytics YOLOv8 to Ultralytics YOLO11, the evolution of the YOLO model series reflects a consistent push toward more intelligent computer vision models. Each version of YOLO brings meaningful upgrades in terms of speed, accuracy, and precision.

As computer vision continues to advance, these models offer reliable solutions to real-world challenges, from object detection to autonomous systems. The ongoing development of YOLO models shows how far the field has come and how much more we can expect in the future.

To learn more about AI, visit our GitHub repository and engage with our community. Discover advancements across industries, from Vision AI in manufacturing to computer vision in healthcare. Check out our licensing options to begin your Vision AI projects today.
