Glossary

Transfer Learning

Unlock the power of transfer learning to save time, boost AI performance, and tackle new tasks with limited data using pre-trained models.

Transfer learning is a Machine Learning (ML) technique where a model developed for one task is reused as the starting point for a model on a second, related task. Instead of building a model from scratch, transfer learning leverages the knowledge (features, weights) learned from a source task, significantly reducing the training time, data requirements, and computational resources needed for the target task. This approach is especially beneficial when the target task has limited labeled data.

How Transfer Learning Works

The process typically begins with a model pre-trained on a large, general dataset, such as ImageNet for image tasks or large text corpora for Natural Language Processing (NLP). This pre-training allows the model, often a Deep Learning (DL) model like a Convolutional Neural Network (CNN) or a Transformer, to learn general features—edges, textures, patterns in images, or grammar and semantics in text.

For the new target task, this pre-trained model is adapted. Common strategies, illustrated in the code sketch that follows this list, include:

  1. Using the Pre-trained Model as a Feature Extractor: The initial layers of the pre-trained model are kept frozen (their weights are not updated), and only the final classifier or task-specific layers are trained on the new dataset.
  2. Fine-tuning: This involves unfreezing some or all of the pre-trained layers and continuing the training process (backpropagation) on the new dataset, typically with a lower learning rate to avoid drastically altering the learned features. Fine-tuning allows the model to specialize its general knowledge for the specific nuances of the target task.
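The minimal PyTorch sketch below illustrates both strategies on an ImageNet-pretrained ResNet-18 (assuming torchvision 0.13+; the 5-class target task is a hypothetical placeholder). It is an outline of the adaptation step, not a complete training loop:

```python
import torch
import torch.nn as nn
from torchvision import models

# Load a model pre-trained on ImageNet as the starting point.
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)

# Strategy 1: feature extraction -- freeze all pre-trained weights.
for param in model.parameters():
    param.requires_grad = False

# Replace the final classifier with a new head for the (hypothetical) 5 target classes.
num_classes = 5
model.fc = nn.Linear(model.fc.in_features, num_classes)

# Only the new head's parameters are updated during training.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)

# Strategy 2: fine-tuning -- unfreeze the backbone and continue training
# with a much lower learning rate so the learned features change gradually.
for param in model.parameters():
    param.requires_grad = True
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)
```

In practice, many workflows combine the two: train only the new head first, then unfreeze some or all backbone layers for a short fine-tuning phase at a reduced learning rate.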

Benefits of Transfer Learning

Employing transfer learning offers several key advantages:

  • Reduced Data Needs: Achieves good performance even with smaller target datasets.
  • Faster Development: Significantly cuts down model training time.
  • Improved Performance: Often leads to higher accuracy and better generalization compared to training from scratch, especially on complex tasks.
  • Resource Efficiency: Saves computational costs (GPU time, energy) associated with extensive training.

Real-World Applications

Transfer learning is widely applied across various domains:

  1. Computer Vision: Models like Ultralytics YOLO, pre-trained on large datasets such as COCO, are frequently adapted for specialized object detection, image segmentation, or image classification tasks. For example, a model pre-trained on everyday objects can be fine-tuned for specific applications like medical image analysis to detect anomalies (tumor detection) or for AI in agriculture to identify specific crops or pests. You can learn how to apply transfer learning with YOLOv5 by freezing layers; a short sketch of this workflow appears after this list.
  2. Natural Language Processing (NLP): Large Language Models (LLMs) like BERT and GPT are pre-trained on massive text datasets. They serve as powerful base models that can be fine-tuned for specific NLP tasks such as sentiment analysis, named entity recognition (NER), or building specialized chatbots. Libraries like Hugging Face Transformers greatly facilitate this process.
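As a concrete example of the computer vision case, the sketch below (assuming the ultralytics Python package in a recent release; the dataset YAML name is a placeholder for your own data) fine-tunes a COCO-pretrained YOLOv8 detection model with part of the backbone frozen:

```python
from ultralytics import YOLO

# Start from a COCO-pretrained detection model instead of random weights.
model = YOLO("yolov8n.pt")

# Fine-tune on a custom dataset; "my_dataset.yaml" is a placeholder for your
# own dataset config. freeze=10 keeps the first 10 layers frozen, so only the
# later layers adapt to the new classes.
model.train(data="my_dataset.yaml", epochs=50, imgsz=640, freeze=10)
```

A Hugging Face fine-tuning run for an NLP task follows the same pattern: load a pre-trained base model, attach a task-specific head, and train on the target dataset.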

Platforms like Ultralytics HUB simplify the process of applying transfer learning by providing pre-trained models (YOLOv8, YOLO11) and tools for easy custom training on user-specific datasets. Frameworks like PyTorch and TensorFlow also offer extensive support and tutorials for transfer learning. For a deeper dive, explore resources like the Stanford CS231n overview or academic surveys like "A Survey on Deep Transfer Learning".
