GELU (Gaussian Error Linear Unit)

Explore the Gaussian Error Linear Unit (GELU) activation function. Learn how its smooth, probabilistic nonlinearity powers Transformers, BERT, and modern AI.

The Gaussian Error Linear Unit (GELU) is an activation function widely used in modern artificial intelligence (AI) systems, particularly those based on the Transformer architecture. Unlike traditional functions that apply a rigid, deterministic threshold to neuron inputs, GELU introduces a probabilistic aspect inspired by the Gaussian distribution. By weighting inputs by their value rather than simply gating them by their sign, GELU provides a smoother nonlinearity that aids in the optimization of deep learning (DL) models. This characteristic helps networks model complex data patterns and has contributed to the success of large foundation models.

How GELU Works

At the core of any neural network, activation functions determine whether a neuron "fires" based on its input signal. Older functions like the Rectified Linear Unit (ReLU) operate like a switch, outputting zero for any negative input and the input itself for positive values. While efficient, this sharp cutoff can hinder training dynamics.
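As a minimal sketch of the switch-like behavior described above, the following pure-Python snippet evaluates ReLU and its slope at a few points. The helper names `relu` and `relu_grad` are illustrative, and treating the slope at exactly zero as 0 is a convention assumed here, not something the text specifies.

```python
def relu(x: float) -> float:
    """ReLU: pass positive inputs through, output zero otherwise."""
    return max(0.0, x)


def relu_grad(x: float) -> float:
    """Slope of ReLU: 1 for positive inputs, 0 for negative inputs.

    The derivative is undefined at exactly x = 0; returning 0 there is
    an assumption of this sketch.
    """
    return 1.0 if x > 0 else 0.0


# Negative inputs produce zero output and zero slope (the hard cutoff).
for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(f"x={x:+.1f}  relu={relu(x):.1f}  grad={relu_grad(x):.1f}")
```

Every negative input gets both a zero output and a zero slope, so no learning signal flows through a unit while it is in that region.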

GELU improves upon this by scaling the input by the cumulative distribution function (CDF) of the standard Gaussian distribution: GELU(x) = x · Φ(x), where Φ(x) is the probability that a standard normal variable is less than or equal to x. Intuitively, as the input value decreases, the probability of the neuron being dropped increases, but gradually rather than abruptly. The result is a smooth, non-monotonic function that is differentiable at all points. This smoothness can improve the backpropagation of gradients, because small negative inputs still receive a nonzero gradient, which can help mitigate issues like the vanishing gradient problem that stall the training of deep networks.
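The description above can be written as GELU(x) = x · Φ(x), with Φ the standard normal CDF. The sketch below computes that expression and its derivative, Φ(x) + x · φ(x), using only the standard library. The function names `gelu` and `gelu_grad` are illustrative choices, not part of any library.

```python
import math


def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    cdf = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return x * cdf


def gelu_grad(x: float) -> float:
    """Derivative of GELU: Phi(x) + x * pdf(x)."""
    cdf = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    return cdf + x * pdf


# Print the value and slope at a few points on both sides of zero.
for x in (-3.0, -1.0, -0.75, 0.0, 1.0, 3.0):
    print(f"x={x:+.2f}  gelu={gelu(x):+.4f}  grad={gelu_grad(x):+.4f}")
```

The output at -3.0 is closer to zero than the output at -1.0, and the slope changes sign on the negative side. This is the non-monotonic dip the text refers to; ReLU has no equivalent.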

Real-World Applications

The smoother optimization landscape provided by GELU has made it the default choice for some of the most advanced applications in machine learning (ML).

Comparison with Related Activation Functions

Understanding GELU often requires distinguishing it from other popular activation functions found in the Ultralytics glossary.

  • GELU vs. ReLU: ReLU is computationally simpler and creates sparsity (exact zeros), which can be efficient. However, the "sharp corner" at zero can slow down convergence. GELU is a smooth approximation of ReLU that typically yields higher accuracy in complex tasks, albeit at a slightly higher computational cost.
  • GELU vs. SiLU (Swish): The Sigmoid Linear Unit (SiLU) is structurally very similar to GELU and shares its smooth, non-monotonic properties. While GELU is dominant in Natural Language Processing (NLP), SiLU is frequently preferred in highly optimized object detectors like YOLO26 due to its efficiency on edge hardware and excellent performance in detection tasks.
  • GELU vs. Leaky ReLU: Leaky ReLU attempts to fix the "dying neuron" problem of standard ReLU by allowing a small, constant linear slope for negative inputs. In contrast, GELU is non-linear for negative values, offering a more complex and adaptive response that often leads to better representation learning in very deep networks.
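To make the comparisons above concrete, here is a small sketch that evaluates all four functions on the same inputs using only the standard library. The helper names are illustrative, and the Leaky ReLU slope of 0.01 is an assumed value chosen for the example.

```python
import math


def relu(x: float) -> float:
    return max(0.0, x)


def leaky_relu(x: float, negative_slope: float = 0.01) -> float:
    # negative_slope=0.01 is an assumption made for this sketch.
    return x if x > 0 else negative_slope * x


def silu(x: float) -> float:
    # SiLU (Swish): x * sigmoid(x)
    return x / (1.0 + math.exp(-x))


def gelu(x: float) -> float:
    # Exact GELU: x * Phi(x), Phi being the standard normal CDF
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


print(f"{'x':>5} {'relu':>8} {'leaky':>8} {'silu':>8} {'gelu':>8}")
for x in (-3.0, -1.0, -0.5, 0.0, 0.5, 1.0, 3.0):
    print(f"{x:>5.1f} {relu(x):>8.4f} {leaky_relu(x):>8.4f} "
          f"{silu(x):>8.4f} {gelu(x):>8.4f}")
```

On the negative side, ReLU returns exact zeros and Leaky ReLU follows a straight line. SiLU and GELU both dip below zero and then curve back toward it, with GELU returning faster.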

Implementation Example

Implementing GELU is straightforward using modern deep learning libraries like PyTorch. The following example demonstrates how to apply the function to a tensor of input data.

import torch
import torch.nn as nn

# Initialize the GELU activation function
gelu_activation = nn.GELU()

# Create sample input data including negative and positive values
input_data = torch.tensor([-3.0, -1.0, 0.0, 1.0, 3.0])

# Apply GELU to the inputs
output = gelu_activation(input_data)

# Print results to see the smoothing effect on negative values
print(f"Input: {input_data}")
print(f"Output: {output}")
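Many implementations use a tanh-based approximation of GELU instead of the exact erf form. Recent PyTorch versions expose this variant through `nn.GELU(approximate="tanh")`. The sketch below is not the PyTorch implementation. It is a dependency-free comparison of the two formulas, with illustrative function names and an arbitrarily chosen evaluation grid.

```python
import math


def gelu_exact(x: float) -> float:
    """Exact GELU using the error function."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x: float) -> float:
    """Tanh approximation of GELU."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


# Largest absolute difference on a grid over [-5, 5] in steps of 0.01.
max_gap = max(
    abs(gelu_exact(i / 100) - gelu_tanh(i / 100)) for i in range(-500, 501)
)
print(f"largest gap on [-5, 5]: {max_gap:.2e}")
```

The two curves agree closely on this grid. For most training workloads, choosing between them is a question of speed and hardware support rather than accuracy.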

For developers looking to leverage these advanced activation functions in their own computer vision projects, the Ultralytics Platform simplifies the entire workflow. It provides a unified interface to annotate data, train models using architectures like YOLO26 (which utilizes optimized activations like SiLU), and deploy them efficiently to the cloud or edge devices.

