Glossary

Parameter-Efficient Fine-Tuning (PEFT)

Discover Parameter-Efficient Fine-Tuning (PEFT) for adapting large AI models with minimal resources. Save costs, prevent overfitting, and optimize deployment!

Parameter-Efficient Fine-Tuning (PEFT) describes a collection of techniques used in machine learning (ML) to adapt large, pre-trained models (like foundation models) to specific downstream tasks without needing to update all of the model's parameters. Instead, PEFT methods focus on modifying only a small subset of parameters or adding a small number of new parameters. This approach drastically reduces the computational and storage costs associated with fine-tuning massive models, such as large language models (LLMs) or large-scale vision models used in computer vision (CV), making customization more accessible and efficient.

Relevance and Benefits

The rise of extremely large pre-trained models, often containing billions of parameters, has made traditional fine-tuning methods resource-intensive. Fully fine-tuning such models requires significant computational power (often multiple high-end GPUs), large amounts of memory, and considerable storage space for each adapted model. PEFT addresses these challenges by offering several key benefits:

  • Reduced Computational Cost: Training only a small fraction of parameters requires significantly less computing power and time, enabling faster iteration and experimentation, potentially using platforms like Ultralytics HUB Cloud Training (a rough comparison is sketched after this list).
  • Lower Memory Requirements: Fewer active parameters mean less memory is needed during training and inference, making it feasible to fine-tune large models on consumer-grade hardware or edge devices.
  • Smaller Storage Footprint: Instead of saving a full copy of the fine-tuned model for each task, PEFT often only requires storing the small set of modified or added parameters, leading to substantial storage savings.
  • Mitigation of Overfitting: By limiting the number of trainable parameters, PEFT can reduce the risk of overfitting, especially when fine-tuning on smaller datasets.
  • Prevention of Catastrophic Forgetting: Because most of the base model's parameters stay frozen, PEFT helps retain the general knowledge learned during pre-training, avoiding catastrophic forgetting, in which a model loses previously learned capabilities when it is trained on new tasks.
  • Efficient Model Deployment: The smaller size of the task-specific parameters makes model deployment simpler, especially in resource-constrained environments like edge AI.
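
The minimal PyTorch sketch below illustrates the parameter-count effect behind these benefits: a stand-in backbone is frozen and only a small task head is trained. The layer sizes are illustrative assumptions, not values from any particular model.

```python
import torch.nn as nn

# Stand-in for a large pre-trained backbone; sizes are illustrative only.
base_model = nn.Sequential(
    nn.Linear(1024, 4096),
    nn.ReLU(),
    nn.Linear(4096, 1024),
)
# Small task-specific module added for fine-tuning.
task_head = nn.Linear(1024, 10)

# Freeze every parameter of the pre-trained backbone.
for param in base_model.parameters():
    param.requires_grad = False

total = sum(p.numel() for p in base_model.parameters()) + sum(
    p.numel() for p in task_head.parameters()
)
trainable = sum(p.numel() for p in task_head.parameters() if p.requires_grad)
print(f"trainable params: {trainable:,} of {total:,} ({100 * trainable / total:.2f}%)")
```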

Key Concepts and Techniques

PEFT builds upon the concept of transfer learning, where knowledge from a base model is applied to a new task. While standard fine-tuning adjusts many (or all) layers, PEFT employs specialized methods. Some popular PEFT techniques include:

  • Adapters: Small neural network modules inserted between the layers of a pre-trained model. Only the parameters of these adapter modules are trained during fine-tuning, while the original model weights remain frozen.
  • LoRA (Low-Rank Adaptation): This technique injects trainable low-rank matrices into the layers (often Transformer layers) of a large model. It hypothesizes that the change needed to adapt the model has a low "intrinsic rank" and can be represented efficiently; a from-scratch sketch of this idea follows the list. Read the original LoRA research paper for details.
  • Prefix-Tuning: Prepends a sequence of continuous, task-specific vectors (prefixes) to the input, keeping the base LLM parameters frozen. Only the prefix parameters are learned.
  • Prompt Tuning: A simplified form of Prefix-Tuning that adds trainable "soft prompts" (learned embeddings) to the input sequence and optimizes them directly through backpropagation, leaving the base model untouched.

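As a rough sketch of the LoRA idea referenced above, the PyTorch code below wraps a frozen linear layer and adds a trainable low-rank update (B · A, scaled by alpha / r). The rank and scaling values are illustrative assumptions, not prescriptions from the original paper.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wraps a frozen nn.Linear and adds a trainable low-rank update."""

    def __init__(self, base_linear: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base_linear
        for param in self.base.parameters():
            param.requires_grad = False  # pre-trained weights stay frozen
        # Trainable low-rank factors: A (r x in_features), B (out_features x r).
        self.lora_A = nn.Parameter(torch.randn(r, base_linear.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base_linear.out_features, r))
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen path plus the scaled low-rank adaptation path (x @ A^T @ B^T).
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling

layer = LoRALinear(nn.Linear(768, 768))
output = layer(torch.randn(4, 768))  # only lora_A and lora_B receive gradients
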
Libraries like the Hugging Face PEFT library provide implementations of various PEFT methods, making them easier to integrate into common ML workflows.
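
For example, applying LoRA with the Hugging Face PEFT library looks roughly like the snippet below; the base model, task type, and hyperparameter values shown are illustrative choices rather than requirements.

```python
from transformers import AutoModelForSequenceClassification
from peft import LoraConfig, TaskType, get_peft_model

# Load a pre-trained base model (model name here is just an example).
base_model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,         # sequence classification task
    r=8,                                # rank of the low-rank update matrices
    lora_alpha=16,                      # scaling factor applied to the update
    lora_dropout=0.1,
    target_modules=["query", "value"],  # attention projections to adapt
)

# Wrap the base model so only the injected LoRA parameters are trainable.
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # reports a small fraction of total params
```

After wrapping, only the injected LoRA parameters require gradients, so the model can be trained with a standard training loop and the resulting adapter saved as a small task-specific add-on to the frozen base weights.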

Real-World Applications

PEFT enables the practical application of large models across a wide range of domains, from natural language processing to computer vision.

In essence, Parameter-Efficient Fine-Tuning makes state-of-the-art AI models like the Ultralytics YOLO models more versatile and cost-effective to adapt for a wide array of specific applications, democratizing access to powerful AI capabilities.
