Discover how real-time inference with Ultralytics YOLO enables instant predictions for AI applications such as autonomous driving and security systems.
Real-time inference refers to the process where a trained machine learning (ML) model makes predictions or decisions immediately as new data arrives. Unlike batch inference, which processes data in groups collected over time, real-time inference prioritizes low latency and instant responses. This capability is essential for applications requiring immediate feedback or action based on live data streams: it enables systems to react dynamically to changing conditions, in line with the principles of real-time computing.
In practice, real-time inference means deploying an ML model, such as an Ultralytics YOLO model for computer vision (CV), so it can analyze individual data inputs (like video frames or sensor readings) and produce outputs with minimal delay. The key performance metric is inference latency, the time taken from receiving an input to generating a prediction. Achieving low latency often involves several strategies, including optimizing the model itself and leveraging specialized hardware and software.
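Inference latency can be measured by timing a single model call from input to output. The sketch below illustrates this with a hypothetical `run_model` function standing in for a real detector (for example, a YOLO predict step); the function name and simulated delay are assumptions for illustration, not a real API.

```python
import time

def run_model(frame):
    """Hypothetical stand-in for a real detector call; the sleep
    simulates roughly 2 ms of model compute."""
    time.sleep(0.002)
    return {"boxes": []}

def timed_inference(frame):
    """Return the prediction together with its inference latency in ms."""
    start = time.perf_counter()
    result = run_model(frame)
    latency_ms = (time.perf_counter() - start) * 1000.0
    return result, latency_ms

result, latency_ms = timed_inference(frame=None)
print(f"latency: {latency_ms:.1f} ms")
```

In a real deployment the same timing wrapper would surround the actual model call, and latencies would typically be aggregated (mean, p95, p99) across many frames rather than read from a single call.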
The key difference lies in how data is processed and in the associated latency requirements:
Real-time inference powers many modern Artificial Intelligence (AI) applications where instantaneous decision-making is crucial, such as autonomous driving and security systems.
Making models run fast enough for real-time applications often requires significant optimization, such as compressing the model itself and exporting it to efficient runtime formats.
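One widely used optimization is quantization, which stores weights in low-precision integers instead of 32-bit floats, shrinking the model and speeding up inference on supported hardware. The sketch below shows simplified symmetric int8 quantization on a plain Python list; real toolchains (e.g. the quantization support in ONNX or TensorRT export paths) operate on whole tensors and calibrate scales per layer.

```python
def quantize_int8(weights):
    """Symmetric int8 quantization: map floats in [-max, max] to [-127, 127].
    Returns the integer weights and the scale needed to recover floats."""
    scale = max(abs(w) for w in weights) / 127.0
    quantized = [round(w / scale) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    """Recover approximate float weights from int8 values."""
    return [q * scale for q in quantized]

weights = [0.5, -1.27, 0.03, 1.0]
q, scale = quantize_int8(weights)
print(q)                      # [50, -127, 3, 100]
print(dequantize(q, scale))   # close to the original weights
```

The trade-off is a small loss of precision (visible in the dequantized values) in exchange for a 4x reduction in weight storage versus float32 and faster integer arithmetic.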
Models like Ultralytics YOLO11 are designed with efficiency and accuracy in mind, making them well-suited for real-time object detection tasks. Platforms like Ultralytics HUB provide tools to train, optimize (e.g., export to ONNX or TensorRT formats), and deploy models, facilitating the implementation of real-time inference solutions across various deployment options.