Underfitting
Learn how to identify, prevent, and address underfitting in machine learning models with expert tips, strategies, and real-world examples.
Underfitting is a common issue in machine learning (ML) where a model is too simple to capture the underlying patterns in the training data. This simplicity prevents it from learning the relationship between the input features and the target variable, leading to poor performance on both the data it was trained on and new, unseen data. An underfit model has high bias, meaning it makes strong, often incorrect, assumptions about the data. This results in a model that fails to achieve a high level of accuracy and cannot generalize well.
Underfitting vs. Overfitting
Underfitting and overfitting are two key challenges in ML that relate to a model's ability to generalize from training data to new data. They represent two extremes on the spectrum of model complexity.
- Underfitting: The model is too simple and has high bias. It fails to learn the underlying structure of the data, resulting in a high loss function value and poor performance on both the training and validation datasets.
- Overfitting: The model is too complex and has high variance. It learns the training data too well, including the noise and random fluctuations. This results in excellent performance on the training set but poor performance on unseen data, as the model has essentially memorized the training examples instead of learning general patterns.
The ultimate goal in ML is to strike a balance between these two, a concept known as the bias-variance tradeoff, to create a model that generalizes effectively to new, real-world scenarios. Analyzing learning curves is a common method for diagnosing whether a model is underfitting, overfitting, or well-fitted.
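The diagnosis described above can be reduced to a simple heuristic: compare the training and validation losses. A minimal pure-Python sketch is below; the function name and the threshold values (`high_loss`, `gap`) are illustrative assumptions, not fixed rules, and in practice you would tune them to your loss scale and inspect the full learning curves.

```python
def diagnose_fit(train_loss, val_loss, high_loss=0.5, gap=0.1):
    """Rough heuristic for reading a pair of loss values.

    High loss on both sets suggests underfitting (high bias);
    low training loss with a large train/validation gap suggests
    overfitting (high variance). Thresholds are illustrative.
    """
    if train_loss > high_loss and val_loss > high_loss:
        return "underfitting"
    if val_loss - train_loss > gap:
        return "overfitting"
    return "good fit"

print(diagnose_fit(0.80, 0.85))  # high loss everywhere -> underfitting
print(diagnose_fit(0.05, 0.40))  # large train/val gap   -> overfitting
print(diagnose_fit(0.10, 0.15))  # both low, small gap   -> good fit
```

In a real workflow these two numbers would come from the final points of the learning curves, and the trend over epochs matters as much as the final values.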
Causes and Solutions for Underfitting
Identifying and addressing underfitting is crucial for building effective models. The problem typically stems from a few common causes, each with corresponding solutions.
- Model is Too Simple: Using a linear model for a complex, non-linear problem is a classic cause of underfitting.
- Solution: Increase model complexity. This could involve switching to a more powerful model architecture, such as a deeper neural network, or moving from a smaller to a larger Ultralytics YOLO model variant. You can explore various YOLO model comparisons to select a more suitable architecture.
- Insufficient or Poor-Quality Features: If the input features provided to the model do not contain enough information to make accurate predictions, the model will underfit.
- Solution: Apply feature engineering to create more informative inputs, or add features that carry a stronger signal about the target variable.
- Insufficient Training: The model may not have been trained for enough epochs to learn the patterns in the data.
- Solution: Train for more epochs, monitoring the training and validation loss until they plateau.
- Excessive Regularization: Techniques like L1 and L2 regularization or high dropout rates are used to prevent overfitting, but if they are too aggressive, they can constrain the model too much and cause underfitting.
- Solution: Reduce the amount of regularization. This might mean lowering the penalty term in regularization functions or reducing the dropout rate. Following best practices for model training can help find the right balance.
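The first cause above, a model that is too simple for the data, can be demonstrated in a few lines of pure Python: fitting a straight line by ordinary least squares to data generated by a quadratic function. The helper names and the toy data are illustrative.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

def mse(xs, ys, model):
    """Mean squared error of a model over the data."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

xs = [-2, -1, 0, 1, 2]
ys = [x * x for x in xs]          # target is quadratic: y = x^2

a, b = fit_line(xs, ys)           # best possible line: a = 0, b = 2
line_err = mse(xs, ys, lambda x: a * x + b)
quad_err = mse(xs, ys, lambda x: x * x)  # a model with matching capacity

print(line_err)  # 2.8 -> the line underfits, even on its own training data
print(quad_err)  # 0.0 -> sufficient capacity captures the pattern exactly
```

No amount of extra training data or extra epochs helps the linear model here; its hypothesis class simply cannot represent the curve, which is the defining symptom of high bias.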
Real-World Examples of Underfitting
- Simple Image Classifier: Imagine training a very basic Convolutional Neural Network (CNN) with only one or two layers on a complex image classification task, such as identifying thousands of object categories in the ImageNet dataset. The model's limited capacity would prevent it from learning the intricate features needed to distinguish between so many classes, resulting in low accuracy on both training and test data. Frameworks like PyTorch and TensorFlow provide the tools to build more sophisticated architectures to overcome this.
- Basic Predictive Maintenance: Consider using a simple linear regression model for predictive modeling to estimate when a machine will fail based only on its operating temperature. If machine failures are actually influenced by a complex, non-linear interplay of factors like vibration, age, and pressure, the simple linear model will underfit. It cannot capture the true complexity of the system, leading to poor predictive performance and an inability to anticipate failures accurately. A more complex model, like a gradient boosting machine or a neural network, would be more appropriate.
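The predictive maintenance example can be made concrete with a toy sketch, assuming failure risk is driven by the interaction of temperature and vibration. A line fit on temperature alone underfits badly, while a single engineered interaction feature lets even a linear model fit perfectly. All names and numbers here are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

def mse(xs, ys, model):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Toy sensor readings: risk depends on temperature * vibration.
temps = [60, 60, 90, 90]
vibs  = [0.1, 0.9, 0.1, 0.9]
risk  = [t * v for t, v in zip(temps, vibs)]

# Linear model on temperature alone: underfits the interaction.
a1, b1 = fit_line(temps, risk)
temp_err = mse(temps, risk, lambda t: a1 * t + b1)

# Engineered interaction feature z = temperature * vibration:
# the relationship becomes linear in z, so the same model type fits exactly.
zs = [t * v for t, v in zip(temps, vibs)]
a2, b2 = fit_line(zs, risk)
inter_err = mse(zs, risk, lambda z: a2 * z + b2)

print(temp_err)   # large error: temperature alone cannot explain the risk
print(inter_err)  # ~0: the interaction feature captures the true pattern
```

This mirrors the feature-related cause of underfitting discussed earlier: sometimes the fix is not a bigger model but better inputs.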