Glossary

Residual Networks (ResNet)

Discover how ResNets revolutionize deep learning by solving vanishing gradients, enabling ultradeep networks for image analysis, NLP, and more.


Residual Networks, commonly known as ResNet, represent a pivotal deep learning (DL) architecture primarily used in computer vision (CV). Introduced by Kaiming He et al. in their paper "Deep Residual Learning for Image Recognition", ResNet addresses the challenge of training very deep neural networks. Before ResNet, simply stacking more layers in a conventional Convolutional Neural Network (CNN) often led to a problem called degradation, where accuracy would saturate and then quickly degrade, not due to overfitting, but because deeper models became harder to optimize. ResNet's innovation allows for the successful training of networks hundreds or even thousands of layers deep.

How ResNet Works: Residual Connections

The core idea behind ResNet is the introduction of "shortcut connections" (also called "skip connections"). These connections bypass one or more layers and perform an identity mapping, adding the block's input directly to its output: instead of learning a desired mapping H(x) directly, the stacked layers learn the residual F(x) = H(x) − x, and the block outputs F(x) + x. This structure helps tackle the vanishing gradient problem that often plagues deep networks during training via backpropagation, because the shortcut gives gradients a direct path back through the network. It also makes it easy for the network to learn an identity mapping when needed: if a block is not beneficial, its residual branch can simply learn F(x) ≈ 0, so the block is effectively skipped. This simplifies optimization for very deep architectures and mitigates the degradation problem observed in plain deep networks.
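The idea can be sketched in a few lines of plain NumPy (a toy illustration, not the exact architecture from the paper, which uses convolutional layers and batch normalization): the block computes a residual branch F(x) and adds the unmodified input back via the shortcut, so zeroed-out branch weights reduce the block to an identity mapping.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """Toy residual block: output = relu(F(x) + x), where F is a
    small two-layer branch and x rides the identity shortcut."""
    f = relu(x @ w1) @ w2          # residual branch F(x)
    return relu(f + x)             # skip connection adds x back

rng = np.random.default_rng(0)
x = np.abs(rng.normal(size=(1, 8)))   # non-negative input, so relu(x) == x

# If the residual branch contributes nothing (F(x) = 0), the block
# reduces to the identity mapping -- it is effectively skipped.
zeros = np.zeros((8, 8))
assert np.allclose(residual_block(x, zeros, zeros), x)
```

Because the shortcut is an addition, the gradient of the block output with respect to its input always contains an identity term, which is why very deep stacks of such blocks remain trainable.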

Applications of ResNet

ResNets have become a foundational architecture in computer vision and are widely used across numerous applications:

  • Image Classification: ResNets achieved state-of-the-art results on image classification benchmarks like ImageNet. The ability to train very deep networks effectively led to significant accuracy improvements for identifying objects and scenes. Many modern architectures use ResNet or its variants as a powerful backbone for feature extraction.
  • Object Detection and Segmentation: Architectures like Ultralytics YOLO often utilize ResNet variants as a backbone for extracting rich features. In object detection, ResNets help accurately locate and classify objects, crucial for applications like analyzing retail shelf layouts or identifying vehicles in traffic monitoring systems. For instance segmentation, they contribute to precise pixel-level object outlining. Explore various object detection architectures to see how ResNet compares.
  • Medical Image Analysis: ResNets are used for tasks such as tumor detection, disease classification from scans, and organ segmentation. For example, in analyzing CT scans, a ResNet-based model can help delineate tumor boundaries for radiation therapy planning. The depth and representational power are essential for capturing subtle patterns, improving diagnostics within AI in healthcare solutions.
  • Facial Recognition: ResNets are employed for robust feature extraction from facial images, enabling accurate identification and verification in security and access control systems.
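The "shared backbone" pattern that recurs in these applications can be sketched schematically. The toy NumPy example below is purely illustrative (the layer sizes and head names are hypothetical, and real detectors operate on convolutional feature maps rather than vectors): one stack of residual blocks produces features that both a classification head and a box-regression head consume.

```python
import numpy as np

rng = np.random.default_rng(42)

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    return relu(relu(x @ w1) @ w2 + x)   # y = relu(F(x) + x)

# Toy "backbone": a stack of residual blocks with a shared feature width.
dim = 16
weights = [(rng.normal(scale=0.1, size=(dim, dim)),
            rng.normal(scale=0.1, size=(dim, dim))) for _ in range(8)]

def backbone(x):
    for w1, w2 in weights:
        x = residual_block(x, w1, w2)
    return x

# Task-specific heads reuse the same backbone features.
w_cls = rng.normal(scale=0.1, size=(dim, 10))   # hypothetical 10-class head
w_box = rng.normal(scale=0.1, size=(dim, 4))    # hypothetical box-regression head

features = backbone(rng.normal(size=(2, dim)))  # batch of 2 inputs
class_logits = features @ w_cls                 # shape (2, 10)
box_preds = features @ w_box                    # shape (2, 4)
```

This is why a single well-trained ResNet backbone transfers across classification, detection, and segmentation: the heads change, the feature extractor does not.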

Advantages of ResNet

The primary advantage of ResNet is its ability to train extremely deep networks effectively, overcoming the degradation and vanishing gradient problems. This depth allows ResNets to learn more complex patterns and hierarchical features from training data, leading to improved performance across various CV tasks. ResNet architectures are also conceptually simple to implement and modify, and they serve as a standard component in many modern deep learning models. Their strong performance and adaptability have made them a cornerstone in AI research and application. Users can leverage pre-trained ResNet models for transfer learning or fine-tuning on custom datasets using platforms like Ultralytics HUB to accelerate development.
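The transfer-learning workflow mentioned above can be sketched in miniature. This is a hedged NumPy illustration with assumed toy dimensions, not a real training pipeline: a "pre-trained" feature extractor stands in for a frozen ResNet backbone, and only a new linear classification head is trained on the custom data.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_classes, n = 8, 3, 32

# Frozen stand-in for a pre-trained ResNet feature extractor.
wb = rng.normal(scale=0.5, size=(dim, dim))
def backbone(x):
    return np.maximum(x @ wb, 0.0)

# Toy "custom dataset" and a fresh head trained from scratch.
x = rng.normal(size=(n, dim))
y = rng.integers(0, n_classes, size=n)
feats = backbone(x)                      # backbone features, computed once
w_head = np.zeros((dim, n_classes))

def loss_and_grad(w):
    """Softmax cross-entropy loss and its gradient w.r.t. the head only."""
    logits = feats @ w
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    loss = -np.log(p[np.arange(n), y]).mean()
    p[np.arange(n), y] -= 1.0
    return loss, feats.T @ p / n

loss0, _ = loss_and_grad(w_head)         # ln(3) with an untrained head
for _ in range(300):                     # plain gradient descent on the head
    _, g = loss_and_grad(w_head)
    w_head -= 0.1 * g
loss1, _ = loss_and_grad(w_head)
assert loss1 < loss0                     # the new head fits the toy task
```

Freezing the backbone and training only the head is the cheapest form of transfer learning; fine-tuning additionally unfreezes some or all backbone layers at a lower learning rate.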
