Glossary

Recall

Learn what Recall is in machine learning, why it matters, and how it measures an AI model's ability to capture critical positive instances.

Recall, also known as sensitivity or the true positive rate, is a fundamental evaluation metric in machine learning (ML) and statistics. It measures a model's ability to correctly identify all relevant instances within a dataset. In simple terms, Recall answers the question: "Of all the actual positive instances, how many did the model correctly predict as positive?" A high Recall score indicates that the model is effective at finding what it's supposed to find, minimizing the number of missed positive cases (false negatives). This metric is particularly critical in applications where failing to detect a positive case has significant consequences.
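Formally, Recall = TP / (TP + FN), where TP is the number of true positives and FN is the number of false negatives. The snippet below is a minimal sketch of this calculation on a toy binary-classification example; the labels are illustrative placeholders, and scikit-learn's recall_score is shown only as one common way to compute the same value.

```python
# Minimal sketch: computing Recall for a binary classifier.
# The labels and predictions below are illustrative placeholders.
from sklearn.metrics import recall_score

y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]  # ground-truth labels (1 = positive)
y_pred = [1, 1, 0, 1, 0, 1, 0, 0, 0, 0]  # model predictions

# Recall = TP / (TP + FN)
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
print(tp / (tp + fn))                # 0.6 -> 3 of the 5 actual positives were found
print(recall_score(y_true, y_pred))  # same value via scikit-learn
```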

The Importance of High Recall

In many real-world scenarios, the cost of a false negative (missing a detection) is much higher than the cost of a false positive (a false alarm). This is where prioritizing high Recall becomes essential. For example, in tasks like medical image analysis or fraud detection, a high-Recall model ensures that as many true cases as possible are captured for further review, even if it means some non-cases are incorrectly flagged.

  • Medical Diagnosis: In an AI-powered system for detecting cancer from medical scans, a high-Recall model is crucial. It is far better to have the system flag a healthy patient for review by a radiologist (a false positive) than to miss a cancerous tumor (a false negative), which could delay life-saving treatment. Many AI in healthcare solutions are optimized for high sensitivity.
  • Security and Surveillance: For a security alarm system designed to detect intruders, high Recall is paramount. The system must identify every potential threat, even if it occasionally mistakes a stray animal for an intruder. Missing a genuine security breach would render the system ineffective.
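One practical way to prioritize Recall is to lower the model's decision threshold so that more borderline cases are flagged as positive, accepting extra false positives in exchange for fewer misses. The sketch below assumes a classifier that outputs confidence scores (the scores and labels are illustrative placeholders) and shows Recall rising as the threshold drops.

```python
# Minimal sketch: lowering the decision threshold trades Precision for Recall.
# The confidence scores and labels below are illustrative placeholders.
from sklearn.metrics import precision_score, recall_score

y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.2, 0.8, 0.3, 0.2, 0.1, 0.1, 0.05]  # model confidence

for threshold in (0.5, 0.3, 0.1):
    y_pred = [1 if s >= threshold else 0 for s in scores]
    p = precision_score(y_true, y_pred)
    r = recall_score(y_true, y_pred)
    print(f"threshold={threshold}: precision={p:.2f}, recall={r:.2f}")
# As the threshold falls, Recall climbs toward 1.0 while Precision drops,
# mirroring the trade-off accepted in the high-Recall scenarios above.
```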

Recall In Ultralytics YOLO Models

In the context of computer vision (CV) and models like Ultralytics YOLO, Recall is a key metric used alongside Precision and mean Average Precision (mAP) to evaluate performance on tasks like object detection and instance segmentation. Achieving a good balance between Recall and Precision is often essential for robust real-world performance. For instance, when comparing models like YOLOv8 vs YOLO11, Recall shows how well each model identifies all target objects. Users can train custom models using frameworks like PyTorch or TensorFlow and track Recall with tools like Weights & Biases or the integrated features in Ultralytics HUB. Understanding Recall helps in optimizing models for specific use cases, which may involve hyperparameter tuning or exploring different model architectures such as YOLOv10 or the latest YOLO11. Resources like the Ultralytics documentation offer comprehensive guides on training and evaluation.
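As a rough illustration, the snippet below sketches how Recall might be read after validating a pretrained Ultralytics YOLO model. The model weights (yolo11n.pt), dataset config (coco8.yaml), and metric attributes such as metrics.box.mr reflect the Ultralytics Python API at the time of writing and should be treated as assumptions that may differ across versions.

```python
# Minimal sketch: reading Recall from an Ultralytics YOLO validation run.
# Asset names and metric attributes are assumptions based on the current API.
from ultralytics import YOLO

model = YOLO("yolo11n.pt")               # load a pretrained detection model
metrics = model.val(data="coco8.yaml")   # validate on a small sample dataset

print(metrics.box.mr)     # mean Recall across classes
print(metrics.box.mp)     # mean Precision across classes
print(metrics.box.map50)  # mAP at IoU threshold 0.50
```

Per-class Precision and Recall values are typically also shown in the printed validation summary, which makes it easy to spot classes the model is missing.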

Recall vs. Other Metrics

It is important to differentiate Recall from other common evaluation metrics.

  • Precision: While Recall focuses on finding all positive samples, Precision measures the accuracy of the positive predictions made. It answers: "Of all the instances the model predicted as positive, how many were actually positive?" There is often a trade-off between Precision and Recall; increasing one may decrease the other. This concept is known as the Precision-Recall tradeoff.
  • Accuracy: Measures the overall percentage of correct predictions (both positive and negative). Accuracy can be a misleading metric for imbalanced datasets, where one class vastly outnumbers the other. For example, in a dataset with 99% negative samples, a model that predicts everything as negative achieves 99% accuracy but has zero Recall for the positive class (a worked sketch of this effect follows this list).
  • F1-Score: This is the harmonic mean of Precision and Recall. The F1-Score provides a single number that balances both metrics, making it a useful measure when you need to consider both false positives and false negatives. It is often used when there is an uneven class distribution.
  • Area Under the Curve (AUC): Specifically for binary classification, the Receiver Operating Characteristic (ROC) curve plots the true positive rate (Recall) against the false positive rate. The AUC provides a single score summarizing the model's performance across all classification thresholds. The area under the Precision-Recall curve (AUC-PR) is often more informative for imbalanced classification tasks.
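To make the contrast concrete, the sketch below (referenced from the Accuracy point above) computes Accuracy, Precision, Recall, and the F1-Score on a deliberately imbalanced toy dataset. The labels are illustrative placeholders chosen so that a high Accuracy hides a very low Recall.

```python
# Minimal sketch: Accuracy vs. Precision, Recall, and F1-Score on imbalanced data.
# The labels below are illustrative placeholders.
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# 95 negatives and 5 positives; the model finds only 1 of the 5 positives.
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 95 + [1, 0, 0, 0, 0]

print(accuracy_score(y_true, y_pred))   # 0.96 -> looks strong despite the misses
print(precision_score(y_true, y_pred))  # 1.00 -> every positive prediction was correct
print(recall_score(y_true, y_pred))     # 0.20 -> only 1 of 5 positives was found
print(f1_score(y_true, y_pred))         # ~0.33 -> harmonic mean exposes the imbalance
```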
