Transfer Learning

Unlock the power of transfer learning to save time, boost AI performance, and tackle new tasks with limited data using pre-trained models.

Transfer learning is a machine learning (ML) technique where a model developed for a specific task is reused as the starting point for a model on a second, related task. Instead of building a model from scratch, which requires significant data and computational resources, transfer learning leverages the knowledge (features, patterns, and weights) learned from a source task to improve learning on a target task. This approach is particularly beneficial when the target task has limited labeled data, significantly accelerating the training process and often leading to better performance compared to training only on the target dataset.

How Transfer Learning Works

The core idea behind transfer learning is that a model trained on a large, general dataset, such as ImageNet for image tasks or a massive text corpus for Natural Language Processing (NLP), learns general features that are useful for many other related tasks. For instance, in computer vision (CV), the initial layers of a Convolutional Neural Network (CNN) might learn to detect edges, textures, and simple shapes, which are fundamental visual elements applicable across a wide range of image recognition problems.
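
The sketch below (a minimal example, assuming PyTorch and torchvision are available) illustrates this idea: it loads a ResNet-18 pre-trained on ImageNet and pulls out an early and a late feature map, showing how the same backbone exposes both low-level and high-level features that can be reused on other tasks.

```python
# Minimal sketch (PyTorch / torchvision assumed): inspect the general features
# learned by a CNN pre-trained on ImageNet, without any training on a new task.
import torch
from torchvision.models import resnet18, ResNet18_Weights
from torchvision.models.feature_extraction import create_feature_extractor

# Load a ResNet-18 with ImageNet weights; its early layers encode generic
# edges, textures, and simple shapes that transfer to many vision tasks.
weights = ResNet18_Weights.DEFAULT
model = resnet18(weights=weights).eval()

# Extract an early and a late feature map to compare levels of abstraction.
extractor = create_feature_extractor(model, return_nodes=["layer1", "layer4"])

image = torch.randn(1, 3, 224, 224)  # stand-in for a preprocessed input image
with torch.no_grad():
    features = extractor(image)

print(features["layer1"].shape)  # low-level features:  torch.Size([1, 64, 56, 56])
print(features["layer4"].shape)  # high-level features: torch.Size([1, 512, 7, 7])
```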

When applying transfer learning, you typically start with a pre-trained model. Depending on the similarity between the source and target tasks and the size of the target dataset, you might:

  1. Use the Pre-trained Model as a Feature Extractor: Freeze the weights of the initial layers (the backbone) and train only the final classification or detection layers on the new dataset. This is common when the target dataset is small; a typical example is training YOLOv5 with some of its layers frozen.
  2. Fine-tune the Pre-trained Model: Unfreeze some or all of the pre-trained layers and continue training them on the new dataset, typically with a lower learning rate. This lets the model adapt the learned features more closely to the nuances of the target task and is the usual choice when the target dataset is larger. Fine-tuning is often considered a specific type of transfer learning. Both strategies are illustrated in the sketch after this list.
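
As a rough illustration of both strategies, the following sketch (assuming PyTorch and torchvision, and a hypothetical 5-class target task) freezes a pre-trained ResNet-18 backbone for feature extraction, and separately fine-tunes the whole network with a lower learning rate on the backbone than on the newly added head.

```python
# Minimal sketch (PyTorch / torchvision assumed) of the two strategies above,
# using an ImageNet-pre-trained ResNet-18 and a hypothetical 5-class target task.
import torch.nn as nn
from torch.optim import SGD
from torchvision.models import resnet18, ResNet18_Weights

NUM_CLASSES = 5  # assumption: the target task has 5 classes

# Strategy 1: feature extractor -- freeze the backbone, train only a new head.
model = resnet18(weights=ResNet18_Weights.DEFAULT)
for param in model.parameters():
    param.requires_grad = False  # freeze all pre-trained weights
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)  # new, trainable head
optimizer = SGD(model.fc.parameters(), lr=1e-3)

# Strategy 2: fine-tuning -- unfreeze everything, but use a lower learning
# rate for the pre-trained backbone than for the freshly initialized head.
model_ft = resnet18(weights=ResNet18_Weights.DEFAULT)
model_ft.fc = nn.Linear(model_ft.fc.in_features, NUM_CLASSES)
optimizer_ft = SGD(
    [
        {"params": [p for n, p in model_ft.named_parameters() if not n.startswith("fc.")], "lr": 1e-4},
        {"params": model_ft.fc.parameters(), "lr": 1e-3},
    ],
    momentum=0.9,
)
# Either model can now be trained with a standard loss and training loop
# on the target dataset.
```

Keeping the backbone learning rate low during fine-tuning is a common way to adapt the pre-trained features gradually without destroying them early in training.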

Real-World Applications

Transfer learning is widely applied across various domains:

  • Computer Vision (CV): Reusing backbones pre-trained on large datasets such as ImageNet for tasks like image classification and object detection, for example by fine-tuning Ultralytics YOLO models on custom datasets.
  • Natural Language Processing (NLP):
    • Sentiment Analysis: Fine-tuning large language models like BERT or GPT, which are pre-trained on vast amounts of text data, to classify the sentiment of specific types of text (e.g., product reviews, social media posts). Hugging Face Transformers provides many such pre-trained models; a minimal fine-tuning sketch follows this list.
    • Named Entity Recognition (NER): Adapting pre-trained language models to identify specific entities (like names, locations, organizations) within domain-specific texts (e.g., legal documents, medical records).
    • Chatbots: Using pre-trained language models as a base to build conversational agents capable of understanding and responding to user queries in specific domains.
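
As a rough sketch of the sentiment-analysis case (assuming the Hugging Face transformers and datasets libraries are installed, and using the IMDB movie-review dataset as a stand-in target task), fine-tuning a pre-trained BERT checkpoint for binary classification might look like this:

```python
# Minimal sketch (Hugging Face transformers + datasets assumed): fine-tune a
# pre-trained BERT checkpoint for binary sentiment classification.
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# The classification head on top of BERT is newly initialized and learned during
# fine-tuning; the pre-trained encoder weights are only adapted, not relearned.
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

dataset = load_dataset("imdb")  # assumption: IMDB reviews as the target task

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=256)

tokenized = dataset.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="bert-sentiment",
        num_train_epochs=1,
        per_device_train_batch_size=16,
        learning_rate=2e-5,  # small learning rate to adapt pre-trained weights gently
    ),
    train_dataset=tokenized["train"].shuffle(seed=42).select(range(2000)),  # small subset for speed
    eval_dataset=tokenized["test"].select(range(500)),
)
trainer.train()
```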

Tools and Frameworks

Platforms like Ultralytics HUB simplify the process of applying transfer learning by providing pre-trained models (such as Ultralytics YOLOv8 and YOLO11) and tools for easy custom training on user-specific datasets. Frameworks such as PyTorch and TensorFlow also offer extensive support and tutorials for implementing transfer learning workflows. For a deeper theoretical understanding, resources like the Stanford CS231n overview on transfer learning or academic surveys such as "A Survey on Deep Transfer Learning" provide valuable insights.
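
As a minimal sketch (assuming the ultralytics Python package is installed; the freeze argument and dataset file are illustrative choices), transfer learning from pre-trained YOLOv8 weights to a custom dataset might look like:

```python
# Minimal sketch (ultralytics package assumed): transfer learning from
# pre-trained YOLOv8 weights to a custom dataset, freezing early layers.
from ultralytics import YOLO

# Start from COCO-pre-trained weights instead of random initialization.
model = YOLO("yolov8n.pt")

# Train on a custom dataset; freeze=10 (keep the first 10 layers frozen) is an
# assumption about the training arguments -- adjust data and settings to your setup.
model.train(data="coco128.yaml", epochs=20, imgsz=640, freeze=10)
```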
