Image Segmentation

Discover the power of image segmentation with Ultralytics YOLO. Explore pixel-level precision, types, applications, and real-world AI use cases.

Image segmentation is a fundamental technique in computer vision (CV) that involves partitioning a digital image into multiple distinct regions or segments. The primary goal is to assign a class label to every pixel in the image, essentially simplifying the image representation into something more meaningful and easier for machines to analyze. Unlike object detection, which identifies objects using rectangular bounding boxes, image segmentation provides a much more granular, pixel-level understanding of the image content, outlining the exact shape of objects. This precision is crucial for tasks demanding detailed spatial awareness.
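To make the contrast with bounding boxes concrete, here is a minimal sketch in plain Python. The toy mask and the helper name `bounding_box` are illustrative assumptions, not part of any library; the point is only that a box covers background pixels that a pixel-level mask excludes.

```python
# Toy binary mask (an assumption for illustration): 1 marks object pixels, 0 is background.
mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]


def bounding_box(mask):
    """Return (row_min, col_min, row_max, col_max), inclusive, of the nonzero pixels."""
    coords = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not coords:
        raise ValueError("mask contains no object pixels")
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return min(rows), min(cols), max(rows), max(cols)


r0, c0, r1, c1 = bounding_box(mask)
box_area = (r1 - r0 + 1) * (c1 - c0 + 1)   # pixels covered by the rectangle
mask_area = sum(sum(row) for row in mask)  # pixels that actually belong to the object

print(f"box covers {box_area} pixels, mask covers {mask_area}")
```

In this toy case the rectangle spans more pixels than the object occupies, which is the extra background that a segmentation mask avoids.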

How Image Segmentation Works

Image segmentation algorithms group pixels that share certain characteristics (such as color, intensity, texture, or spatial location) into segments. Early methods relied on techniques like thresholding, region growing, and clustering (K-Means, DBSCAN). Modern approaches rely heavily on deep learning (DL), particularly Convolutional Neural Networks (CNNs), which learn hierarchical features directly from training data and use them to classify each pixel. The typical output is a segmentation mask: an image of the same size as the input in which each pixel's value is the class label assigned to that pixel, so the boundaries of objects or regions are marked precisely. Frameworks like PyTorch and TensorFlow are commonly used to build and train these models.
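The two ideas above, a classical rule and a per-pixel classification, can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions: the pixel values, the threshold of 128, and the function names `threshold_segment` and `scores_to_mask` are all made up for the example, and the class scores stand in for what a trained network would output.

```python
def threshold_segment(image, threshold):
    """Classical thresholding: label a pixel 1 if its intensity exceeds threshold, else 0."""
    return [[1 if px > threshold else 0 for px in row] for row in image]


def scores_to_mask(scores):
    """Pixel-wise classification: pick the index of the highest class score at each pixel."""
    return [
        [max(range(len(px)), key=px.__getitem__) for px in row]
        for row in scores
    ]


# Hypothetical grayscale image: a dark region on the left, a bright one on the right.
image = [
    [12, 15, 200, 210],
    [10, 18, 205, 220],
    [11, 14, 13, 215],
]
threshold_mask = threshold_segment(image, threshold=128)

# Hypothetical per-pixel scores for two classes (index 0 and index 1),
# standing in for the output of a trained segmentation network.
scores = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.6, 0.4], [0.3, 0.7]],
]
class_mask = scores_to_mask(scores)

for row in threshold_mask:
    print(row)
```

Both functions return a mask with the same height and width as the input, where each entry is a class label rather than an intensity.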

Types of Image Segmentation

Image segmentation tasks can vary based on how objects and classes are handled:

  • Semantic Segmentation: Assigns each pixel to a predefined category (e.g., 'car', 'road', 'sky'). It does not distinguish between different instances of the same object class. All cars, for example, would share the same label.
  • Instance Segmentation: Goes a step further than semantic segmentation by identifying and delineating each individual object instance within an image. Each separate car would get a unique identifier or mask, even if they belong to the same class. This is particularly useful when counting or tracking individual objects is necessary.
  • Panoptic Segmentation: Combines semantic and instance segmentation. It assigns a class label to every pixel (like semantic segmentation) and also gives each countable object, such as a car or a person, its own instance identifier (like instance segmentation). Amorphous regions such as sky or road receive only a class label. The result is a single, unified description of the whole scene.
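The difference between the three outputs can be illustrated on a toy label map. The sketch below is a deliberate simplification, not how instance segmentation models work: it separates instances with 4-connected component labeling, which only works when objects of the same class do not touch, whereas real models predict a mask per object directly. The class ids `ROAD` and `CAR` and the function name `label_instances` are assumptions made for the example.

```python
from collections import deque


def label_instances(semantic, target_class):
    """Give each 4-connected blob of target_class its own positive id; 0 elsewhere."""
    height, width = len(semantic), len(semantic[0])
    instances = [[0] * width for _ in range(height)]
    next_id = 0
    for r in range(height):
        for c in range(width):
            if semantic[r][c] != target_class or instances[r][c]:
                continue
            # Unvisited pixel of the target class: start a new instance and flood fill it.
            next_id += 1
            instances[r][c] = next_id
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (
                        0 <= ny < height
                        and 0 <= nx < width
                        and semantic[ny][nx] == target_class
                        and not instances[ny][nx]
                    ):
                        instances[ny][nx] = next_id
                        queue.append((ny, nx))
    return instances, next_id


ROAD, CAR = 0, 1  # hypothetical class ids

# Semantic output: every pixel has a class, but the two cars are indistinguishable.
semantic = [
    [1, 1, 0, 0, 1],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
]

# Instance-style output: each separate car gets its own id.
instances, n_cars = label_instances(semantic, CAR)

# Panoptic-style output: a (class, instance id) pair per pixel; road pixels keep id 0.
panoptic = [
    [(cls, inst) for cls, inst in zip(sem_row, inst_row)]
    for sem_row, inst_row in zip(semantic, instances)
]

print(f"cars found: {n_cars}")
```

The semantic map alone cannot say how many cars are present; the instance ids make counting possible, and the paired map keeps both pieces of information for every pixel.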

Real-World Applications

The detailed analysis provided by image segmentation enables numerous applications:

Image Segmentation and Ultralytics YOLO

Ultralytics YOLO models, such as YOLOv8 and YOLO11, provide state-of-the-art performance for instance segmentation tasks, balancing speed and accuracy for real-time inference. The Ultralytics framework simplifies the process of training custom segmentation models on datasets like COCO or specialized datasets such as car parts or crack segmentation. Tools like Ultralytics HUB offer a streamlined platform for managing datasets, training models (cloud training available), and deploying them. You can explore the segmentation task documentation for implementation details or follow guides like segmentation with pre-trained YOLOv8 models or image segmentation with YOLO11 on Google Colab.
