Underfitting

Discover what underfitting is, its causes, signs, and solutions. Learn how to improve model performance and avoid underfitting issues.

Underfitting occurs when a machine learning model is too simple to capture the underlying structure of the data. This typically happens when the model has too few parameters or features relative to the complexity of the data it is trying to learn. As a result, the model fails to adequately learn from the training data and performs poorly not only on the training set but also on unseen data, such as a validation or test set.
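
To make this concrete, here is a minimal sketch (scikit-learn and NumPy assumed, synthetic data): a straight line is fitted to points generated from a quadratic function, and the R² score comes out low on both the training and the test split, which is the signature of underfitting.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = X[:, 0] ** 2 + rng.normal(0, 0.5, size=200)  # quadratic ground truth

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LinearRegression().fit(X_train, y_train)
# A straight line cannot represent a parabola, so both scores stay low.
print(f"train R^2: {model.score(X_train, y_train):.2f}")
print(f"test  R^2: {model.score(X_test, y_test):.2f}")
```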

Key Characteristics of Underfitting

Underfit models are often characterized by high bias and low variance. Bias refers to the error introduced by approximating a real-world problem, which may be complex, by a simplified model. An underfit model makes overly simplistic assumptions about the data, leading to systematic errors. Variance, on the other hand, refers to the model's sensitivity to fluctuations in the training data. Underfit models exhibit low variance because they are too simple to be affected much by changes in the training data. However, this simplicity also means they cannot capture important patterns and nuances in the data.
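
This trade-off is captured by the standard bias-variance decomposition. Writing the target as $y = f(x) + \varepsilon$, where $\varepsilon$ is noise with variance $\sigma^2$ and $\hat{f}$ is the learned model, the expected squared error splits into three parts:

$$
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
= \underbrace{\big(f(x) - \mathbb{E}[\hat{f}(x)]\big)^2}_{\text{bias}^2}
+ \underbrace{\operatorname{Var}\big[\hat{f}(x)\big]}_{\text{variance}}
+ \underbrace{\sigma^2}_{\text{irreducible noise}}
$$

Underfit models live in the high-bias corner: the first term dominates, and no amount of extra training data can reduce it.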

Causes of Underfitting

Several factors can contribute to underfitting:

  • Model Complexity: Using a model that is too simple for the complexity of the data. For example, trying to fit a linear model to data with a highly nonlinear relationship.
  • Insufficient Training: Not training the model for enough epochs, or using a poorly tuned learning rate (too high to settle into a good minimum, or too low to make meaningful progress in the epochs available), so training stops before the model has learned the underlying patterns.
  • Poor Feature Engineering: Failing to include relevant features or using features that do not adequately represent the underlying structure of the data. Effective feature engineering is crucial for building models that can generalize well.
  • Over-Regularization: Applying too much regularization, which penalizes model complexity and can prevent the model from learning important patterns (see the sketch after this list).
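
To illustrate the last cause, here is a small sketch (scikit-learn assumed, synthetic data): the true relationship is perfectly linear, yet an extreme Ridge penalty shrinks the coefficients toward zero and the model underfits anyway.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ np.array([3.0, -2.0, 1.5, 0.5, 4.0]) + rng.normal(0, 0.1, size=200)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for alpha in (1.0, 1e6):  # moderate vs. extreme penalty strength
    model = Ridge(alpha=alpha).fit(X_tr, y_tr)
    # A huge alpha shrinks every coefficient toward zero, so the model
    # underfits even though the true relationship is perfectly linear.
    print(f"alpha={alpha:g}  test R^2 = {model.score(X_te, y_te):.2f}")
```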

Detecting Underfitting

Identifying underfitting is essential for improving model performance. Signs of underfitting include:

  • High Training Error: The model performs poorly on the training data, indicating it has not learned the underlying patterns.
  • High Validation/Test Error: Poor performance on validation or test sets, similar to the training error, suggests the model is too simplistic.
  • Learning Curves: Plotting the model's performance on the training and validation sets as training progresses (or as the training set grows) can reveal underfitting. If both curves plateau at a high error rate, the model is likely underfitting; see the sketch below.
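
A minimal sketch of the learning-curve check (scikit-learn assumed; here the curve is over training-set size rather than epochs):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import learning_curve

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(300, 1))
y = X[:, 0] ** 2 + rng.normal(0, 0.5, size=300)  # nonlinear target

sizes, train_scores, val_scores = learning_curve(
    LinearRegression(), X, y, cv=5, train_sizes=np.linspace(0.1, 1.0, 5)
)
# When both curves plateau at a low score, more data will not help:
# the model itself is too simple.
for n, tr, va in zip(sizes, train_scores.mean(axis=1), val_scores.mean(axis=1)):
    print(f"n={n:3d}  train R^2={tr:.2f}  val R^2={va:.2f}")
```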

Addressing Underfitting

To combat underfitting, consider the following strategies:

  • Increase Model Complexity: Use a more complex model with more parameters or layers. For example, switch from a linear model to a polynomial model or from a shallow to a deep neural network.
  • Train Longer: Increase the number of training epochs or adjust the learning rate to allow the model more time to learn from the data.
  • Improve Feature Engineering: Add more relevant features or transform existing features to better represent the data's structure. Techniques like creating interaction terms or polynomial features can help (see the sketch after this list).
  • Reduce Regularization: Decrease the amount of regularization applied to the model, allowing it to fit the training data more closely.
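
Picking up the quadratic example from earlier, here is a minimal sketch of the first and third remedies (scikit-learn assumed): adding a squared feature lets an otherwise linear pipeline capture the curve.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = X[:, 0] ** 2 + rng.normal(0, 0.5, size=200)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

linear = LinearRegression().fit(X_tr, y_tr)
poly = make_pipeline(PolynomialFeatures(degree=2), LinearRegression()).fit(X_tr, y_tr)

# The squared feature lets an otherwise linear model capture the curve.
print(f"linear    test R^2 = {linear.score(X_te, y_te):.2f}")
print(f"degree-2  test R^2 = {poly.score(X_te, y_te):.2f}")
```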

Underfitting vs. Overfitting

It is important to distinguish underfitting from overfitting. While underfitting occurs when a model is too simple, overfitting happens when a model is too complex and starts to memorize the training data, including noise and outliers. Overfit models perform exceptionally well on training data but poorly on unseen data. Balancing model complexity and training is crucial to avoid both underfitting and overfitting.
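
One way to see both failure modes side by side is to sweep model complexity and score each setting with cross-validation. In this synthetic sketch (exact numbers will vary), degree 1 underfits, a very high degree typically overfits and scores worse on held-out folds, and an intermediate degree balances the two:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(60, 1))
y = X[:, 0] ** 2 + rng.normal(0, 0.5, size=60)

for degree in (1, 2, 15):  # too simple, about right, too flexible
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    score = cross_val_score(model, X, y, cv=5).mean()
    print(f"degree={degree:2d}  mean CV R^2 = {score:.2f}")
```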

Real-World Examples

Example 1: Predicting House Prices

Imagine you are building a model to predict house prices based on their size. If you use a simple linear regression model and assume that house prices increase linearly with size, you might underfit the data. In reality, the relationship between house size and price is likely more complex, involving factors like diminishing returns for larger sizes or premium prices for certain size ranges. A linear model would fail to capture these nuances, resulting in poor predictive performance on both training and new data.
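
Here is a fully synthetic illustration of this scenario (the pricing rule below is invented for demonstration): the market pays a flat premium above a size threshold, which a model that is linear in size alone cannot represent.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
size = rng.uniform(50, 400, size=300)  # floor area in m^2 (synthetic)
# Hypothetical market: a flat premium applies to homes larger than 250 m^2.
price = 1_000 * size + 300_000 * (size > 250) + rng.normal(0, 10_000, size=300)

X_size = size.reshape(-1, 1)
X_rich = np.column_stack([size, size > 250])  # add the premium indicator

# The plain line absorbs the premium as extra slope and misfits both regimes.
print(f"size only       R^2 = {LinearRegression().fit(X_size, price).score(X_size, price):.3f}")
print(f"size + premium  R^2 = {LinearRegression().fit(X_rich, price).score(X_rich, price):.3f}")
```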

Example 2: Image Classification

Consider an image classification task where you are trying to classify images of animals into different categories. If you use a very simple model, such as logistic regression, you might underfit the data. Image classification often requires capturing complex patterns and features in images, which a simple model cannot do. As a result, the model would perform poorly on both the training set and new, unseen images. Using a more complex model, like a convolutional neural network (CNN), can significantly improve performance.
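
As a structural sketch (PyTorch assumed; the models are untrained and the input is random), compare a flat linear classifier with a small CNN: both map a 32×32 RGB image to 10 class logits, but only the CNN's convolutional layers can exploit local spatial structure.

```python
import torch
import torch.nn as nn

# Flat linear classifier (multinomial logistic regression on raw pixels):
# it has no notion of spatial structure and tends to underfit natural images.
linear = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))

# Small CNN: convolution and pooling layers capture local patterns
# that the flat model cannot represent.
cnn = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(), nn.Linear(32 * 8 * 8, 10),
)

x = torch.randn(4, 3, 32, 32)  # a batch of four 32x32 RGB images
print(linear(x).shape, cnn(x).shape)  # both: torch.Size([4, 10])
```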

By understanding the causes and signs of underfitting, practitioners can take appropriate steps to enhance their models. Tools like Ultralytics YOLOv8 provide advanced capabilities for building and tuning complex models, helping to avoid underfitting and improve performance on various computer vision tasks. For more insights into model training and optimization, visit the Ultralytics Blog.
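
For instance, if a compact variant plateaus at a high loss, the Ultralytics API makes it straightforward to swap in a larger model from the same family (the snippet below uses the tiny coco8 demo dataset from the Ultralytics docs; adjust data and epochs to your task):

```python
from ultralytics import YOLO

# If a compact model (e.g., yolov8n.pt) plateaus at a high loss, a larger
# variant in the same family adds capacity with an identical workflow.
model = YOLO("yolov8s.pt")
model.train(data="coco8.yaml", epochs=100, imgsz=640)  # coco8: tiny demo dataset
```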
