Glossary

Accuracy

Discover the importance of accuracy in machine learning, its calculation, limitations with imbalanced datasets, and ways to improve model performance.


Accuracy is a fundamental performance metric in machine learning (ML), particularly for classification tasks. It measures the proportion of total predictions that a model correctly identified. Simply put, it answers the question: "Out of all the predictions made, how many were actually correct?" It provides a straightforward, high-level overview of a model's overall performance on a given dataset.

Understanding Accuracy

Accuracy is calculated by dividing the number of correct predictions (both true positives and true negatives) by the total number of predictions made: Accuracy = (TP + TN) / (TP + TN + FP + FN). While intuitive, accuracy alone can be misleading, especially when dealing with imbalanced datasets – situations where one class significantly outnumbers the others. For instance, if a dataset contains 95% non-spam emails and 5% spam emails, a model that simply predicts "not spam" for every email achieves 95% accuracy despite being useless for identifying actual spam. It is therefore crucial to consider accuracy alongside other evaluation metrics, such as those covered in model evaluation and fine-tuning strategies, for a complete picture of model performance.
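The calculation and the imbalanced-dataset pitfall can be sketched in a few lines of Python. The counts below are hypothetical, chosen to mirror the 95%/5% spam example:

```python
def accuracy(tp, tn, fp, fn):
    """Accuracy = correct predictions / total predictions."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical dataset: 950 non-spam emails, 50 spam emails.
# A model that always predicts "not spam" gets every non-spam
# email right (950 true negatives) and misses every spam email
# (50 false negatives) -- yet still scores 95% accuracy.
print(accuracy(tp=0, tn=950, fp=0, fn=50))  # 0.95
```

Despite the high score, this model never catches a single spam email, which is why precision and recall are needed alongside accuracy.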

Accuracy vs. Other Metrics

It's important to distinguish accuracy from related metrics:

  • Precision: Measures the proportion of positive identifications that were actually correct. It answers: "Of all the items predicted as positive, how many truly were positive?" High precision is crucial when the cost of a false positive is high.
  • Recall (Sensitivity): Measures the proportion of actual positives that were correctly identified. It answers: "Of all the true positive items, how many did the model correctly identify?" High recall is vital when missing a positive case (false negative) is costly.
  • F1-Score: The harmonic mean of Precision and Recall, providing a single score that balances both metrics. It's particularly useful when dealing with imbalanced classes.
  • Mean Average Precision (mAP): A common metric in object detection tasks, like those performed by Ultralytics YOLO models, which considers both classification correctness and localization accuracy (Intersection over Union - IoU). Simple accuracy isn't suitable here as it doesn't account for the bounding box placement.

These metrics are often derived from a Confusion Matrix, which provides a detailed breakdown of correct and incorrect classifications for each class. Understanding these YOLO performance metrics is essential.
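As a minimal sketch, the metrics above can all be computed from the four cells of a binary confusion matrix. The counts used here are hypothetical, purely for illustration:

```python
def precision(tp, fp):
    """Of all items predicted positive, how many truly were positive?"""
    return tp / (tp + fp)


def recall(tp, fn):
    """Of all actual positives, how many did the model find?"""
    return tp / (tp + fn)


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall."""
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r)


# Hypothetical confusion-matrix counts for the positive class.
tp, fp, fn = 40, 10, 10
print(precision(tp, fp))   # 0.8
print(recall(tp, fn))      # 0.8
print(f1_score(tp, fp, fn))  # 0.8
```

In practice, libraries such as scikit-learn provide vetted implementations of these metrics, but the arithmetic is exactly this simple.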

Real-World Applications and Examples

Accuracy serves as a baseline metric in many applications:

  1. Spam Email Filtering: In classifying emails as 'spam' or 'not spam', accuracy indicates the overall percentage of emails correctly categorized. However, due to the typically low percentage of spam emails (an imbalanced dataset problem), relying solely on accuracy could be deceptive. Metrics like precision and recall are often more informative here to ensure spam is caught without flagging legitimate emails incorrectly.
  2. Medical Image Analysis: Consider an AI in healthcare model designed to classify medical scans as showing a tumor ('positive') or not ('negative'). While overall accuracy is important, recall becomes critical. Missing a tumor (a false negative) can have severe consequences, so ensuring the model identifies as many actual tumor cases as possible is paramount, even if it means lower precision (more false positives that require further review). Platforms like Ultralytics HUB can help manage the training and evaluation process for such models.

Accuracy in Ultralytics

Within the Ultralytics ecosystem, accuracy is one of several metrics used to evaluate model performance, especially for image classification tasks. When comparing models, such as YOLO11 vs YOLOv8, accuracy benchmarks on standard datasets like ImageNet provide valuable comparison points, alongside inference speed and computational cost. However, for detection and segmentation tasks, metrics like mAP are prioritized as they better reflect the specific challenges of those tasks.
