Glossary

Pruning

Optimize AI models with pruning: reduce complexity, improve efficiency, and deploy to edge devices faster without sacrificing performance.

Pruning is a model optimization technique used in artificial intelligence (AI) and machine learning (ML) to reduce the size and computational complexity of trained models. It involves selectively removing parameters, such as weights or connections within a neural network (NN), that are identified as less important or redundant for the model's task. The primary objective is to create smaller, faster models that require fewer computational resources and less memory, ideally without a significant decrease in performance or accuracy. This process is a key part of efficient model deployment, especially on devices with limited capabilities. While "Pruning" is the general term, "Model Pruning" specifically refers to applying this technique to ML models.

Relevance of Pruning

As deep learning (DL) models grow larger and more complex to tackle sophisticated tasks, their demand for computational power, storage, and energy increases significantly. Pruning directly addresses this challenge by making models more lightweight and efficient. This optimization leads to several benefits: reduced storage needs, lower energy consumption during operation, and decreased inference latency, which is critical for applications requiring real-time inference. Pruning is particularly valuable for deploying models in resource-constrained environments such as mobile devices, embedded systems, and various Edge AI scenarios where efficiency is a primary concern. It can also help mitigate overfitting by simplifying the model.

Applications of Pruning

Pruning techniques are broadly applied across numerous AI domains. Here are two concrete examples:

Deploying Object Detection Models on Edge Devices: An Ultralytics YOLO model trained for object detection might be too large or slow for deployment on a low-power device like a Raspberry Pi or a Google Edge TPU. Pruning can reduce the model's size and computational load, enabling it to run effectively on such hardware for tasks like security systems or local wildlife monitoring. See guides like the Edge TPU on Raspberry Pi tutorial or the NVIDIA Jetson guide for deployment examples.
Optimizing Models for Autonomous Systems: In autonomous vehicles, complex perception models for tasks like image segmentation or sensor fusion must run with minimal latency. Pruning helps optimize these Convolutional Neural Networks (CNNs) to meet strict real-time processing requirements, ensuring safe and responsive vehicle operation. Frameworks like NVIDIA TensorRT often support pruned models for optimized inference.

Types and Techniques

Pruning methods vary but generally fall into these main categories:

Unstructured Pruning: This involves removing individual weights (connections) based on criteria such as low magnitude or low contribution to the output. It results in sparse models with irregular patterns of removed connections. While potentially achieving high compression rates, these models may require specialized hardware or software libraries (like Neural Magic's DeepSparse) for efficient execution. See the Ultralytics Neural Magic Integration.
Structured Pruning: This technique removes entire structural components of the network, such as filters, channels, or even layers. This maintains a regular structure, making the pruned model more compatible with standard hardware accelerators and with library features such as NVIDIA's structured sparsity support.
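
The difference between the two categories can be illustrated with a minimal sketch in plain Python. It is not taken from any particular framework: the function names `prune_unstructured` and `prune_structured`, the toy weight matrix, and the choice of magnitude and L1 norm as importance criteria are all assumptions made for illustration. Each row of the matrix stands in for one filter or channel.

```python
def prune_unstructured(weights, sparsity):
    """Zero the smallest-magnitude individual weights; the matrix shape is unchanged."""
    flat = sorted(abs(w) for row in weights for w in row)
    k = int(len(flat) * sparsity)  # number of weights to zero
    if k == 0:
        return [row[:] for row in weights]
    threshold = flat[k - 1]  # k-th smallest magnitude (ties may zero a few extra)
    return [[0.0 if abs(w) <= threshold else w for w in row] for row in weights]


def prune_structured(weights, rows_to_drop):
    """Remove whole rows (stand-ins for filters/channels) with the smallest L1 norm."""
    norms = [sum(abs(w) for w in row) for row in weights]
    weakest = sorted(range(len(weights)), key=lambda i: norms[i])[:rows_to_drop]
    drop = set(weakest)
    return [row[:] for i, row in enumerate(weights) if i not in drop]


# Toy 4x3 weight matrix with made-up values.
weights = [
    [0.9, -0.05, 0.4],
    [0.02, 0.01, -0.03],
    [-0.7, 0.6, 0.1],
    [0.2, -0.8, 0.3],
]

sparse = prune_unstructured(weights, sparsity=0.5)  # same shape, scattered zeros
compact = prune_structured(weights, rows_to_drop=1)  # smaller, still dense

print(sparse)
print(compact)
```

The unstructured result keeps its 4x3 shape and only saves computation if the runtime can exploit the scattered zeros, while the structured result is a smaller dense matrix that ordinary dense kernels can use directly.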

Pruning can be implemented at different stages: before training (influencing architecture design), during the training process, or after training on a pre-trained model, often followed by fine-tuning to regain any lost accuracy. Major deep learning frameworks like PyTorch and TensorFlow provide tools and tutorials, such as the PyTorch Pruning Tutorial, to implement various pruning strategies.
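
As a rough illustration of post-training pruning followed by fine-tuning, the sketch below builds a magnitude-based mask and reapplies it during a gradient step so that pruned weights stay at zero. It uses plain Python rather than PyTorch or TensorFlow. The helper names `magnitude_mask` and `masked_sgd_step`, the learning rate, and the gradient values are hypothetical and chosen only to make the example self-contained.

```python
def magnitude_mask(weights, sparsity):
    """Return a 0/1 mask that drops the smallest-magnitude fraction of weights."""
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[: int(len(weights) * sparsity)])
    return [0 if i in pruned else 1 for i in range(len(weights))]


def masked_sgd_step(weights, grads, mask, lr=0.1):
    """One SGD update; multiplying by the mask keeps pruned positions at zero."""
    return [m * (w - lr * g) for w, g, m in zip(weights, grads, mask)]


# Weights of an already-trained layer (made-up values).
weights = [0.8, -0.05, 0.3, 0.01, -0.6]

# Step 1: prune after training (2 of the 5 weights at 40% sparsity).
mask = magnitude_mask(weights, sparsity=0.4)
pruned = [w * m for w, m in zip(weights, mask)]

# Step 2: fine-tune the surviving weights. These gradients are hypothetical;
# in practice they would come from backpropagation on training data.
grads = [0.5, -1.0, 0.2, 0.7, -0.1]
tuned = masked_sgd_step(pruned, grads, mask)

print(mask)
print(tuned)
```

Without the mask, the update in step 2 would move the pruned weights away from zero and undo the sparsity, which is why pruning utilities typically keep the mask attached to the parameter while fine-tuning.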

Pruning vs. Other Optimization Techniques

Pruning is one of several techniques used for model optimization. It's useful to distinguish it from related concepts:

Model Quantization: Reduces the precision of the model's weights and activations (e.g., from 32-bit floats to 8-bit integers), decreasing model size and often speeding up computation, particularly on specialized hardware.
Knowledge Distillation: Involves training a smaller "student" model to mimic the behavior of a larger, pre-trained "teacher" model, transferring knowledge without inheriting the complexity.

These techniques are not mutually exclusive and are frequently used in combination with pruning to achieve greater levels of optimization. For example, a model might be pruned first, then quantized for maximum efficiency. Optimized models can often be exported to standard formats like ONNX using tools like the Ultralytics export function for broad deployment compatibility across different inference engines.
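
The prune-then-quantize order mentioned above can be sketched as follows. This is a simplified illustration in plain Python, assuming a fixed magnitude threshold for pruning and symmetric per-tensor 8-bit quantization. The helper names and the weight values are hypothetical, and real toolchains usually add calibration data and per-channel scales.

```python
def prune_by_magnitude(weights, threshold):
    """Zero every weight whose magnitude falls below the threshold."""
    return [0.0 if abs(w) < threshold else w for w in weights]


def quantize_int8(weights):
    """Symmetric per-tensor quantization: floats -> integer codes in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate float values."""
    return [c * scale for c in codes]


# Made-up 32-bit float weights.
weights = [1.27, -0.02, 0.5, 0.004, -0.64]

pruned = prune_by_magnitude(weights, threshold=0.05)  # step 1: prune
codes, scale = quantize_int8(pruned)                  # step 2: quantize
restored = dequantize(codes, scale)

print(codes, scale)
print(restored)
```

Because the scheme is symmetric, a pruned weight of exactly zero maps to the integer code 0, so the sparsity introduced by pruning survives quantization.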

In summary, pruning is a powerful technique for creating efficient AI models suitable for diverse deployment needs, playing a significant role in the practical application of computer vision (CV) and other ML tasks. Platforms like Ultralytics HUB provide tools and infrastructure, including cloud training, that can facilitate the development and optimization of models like YOLOv8 or YOLO11.
