
ONNX (Open Neural Network Exchange)

Discover how ONNX enhances AI model portability and interoperability, enabling seamless deployment of Ultralytics YOLO models across diverse platforms.


In the rapidly evolving field of artificial intelligence (AI) and machine learning (ML), moving models between tools and platforms efficiently is crucial. ONNX (Open Neural Network Exchange) addresses this challenge with an open-source format designed specifically for AI models. It acts as a universal translator, allowing developers to train a model in one framework, such as PyTorch, and then deploy it with another framework or inference engine, such as TensorFlow or a specialized runtime like ONNX Runtime. This interoperability streamlines the path from research to production and fosters collaboration and flexibility across the AI ecosystem. ONNX was co-developed by Facebook and Microsoft and is now maintained as an open community project.

Relevance of ONNX

The core value of ONNX lies in promoting portability and interoperability within the AI development lifecycle. Instead of being locked into a specific framework's ecosystem, developers can leverage ONNX to move models freely between different tools and hardware platforms. By defining a common set of operators (the building blocks of neural networks) and a standard file format (.onnx), ONNX ensures that a model's structure and learned parameters (weights) are represented consistently. This is particularly beneficial for users of Ultralytics YOLO models, as Ultralytics provides straightforward methods for exporting models to ONNX format. This export capability allows users to take models like YOLOv8 or the latest YOLO11 and deploy them on a wide variety of hardware and software platforms, often utilizing optimized inference engines for enhanced performance and hardware acceleration.

How ONNX Works

ONNX achieves interoperability through several key technical features:

  • Common Model Representation: ONNX defines a standard set of computational graph operators, such as convolution or activation functions, and a common data type system. When a deep learning model is converted to ONNX, its architecture and parameters are translated into this shared representation.
  • Graph-based Structure: Models in ONNX are represented as computational graphs. Nodes in the graph represent operations (like matrix multiplication or applying a ReLU function), while edges represent the flow of data (tensors) between these operations. This graph structure is common across many ML frameworks, facilitating easier conversion.
  • Versioning System: ONNX maintains versions for its operator sets (opsets). This ensures backward compatibility, allowing models created with older opsets to still run on newer runtimes that support those versions.
  • Extensibility: While ONNX defines a core set of operators, it also allows for custom operators, enabling frameworks and hardware vendors to support specialized functionalities.
  • Ecosystem and Tools: A rich ecosystem surrounds ONNX, including libraries for converting models from various frameworks (like PyTorch or TensorFlow), tools for visualizing and debugging ONNX graphs, and runtimes like ONNX Runtime optimized for high-performance inference across different hardware (CPU, GPU, specialized accelerators).

Applications of ONNX

ONNX serves as a crucial bridge between model training environments and diverse deployment targets. Here are two concrete examples:

  1. Deploying Computer Vision Models on Edge Devices: A developer trains an object detection model, such as an Ultralytics YOLO model, using PyTorch on a powerful server with GPUs. For deployment on resource-constrained edge devices (like a smart camera or a drone), they export the model to ONNX format. This ONNX file can then be optimized using tools like NVIDIA TensorRT or Intel's OpenVINO and deployed for efficient, real-time inference directly on the device. This flexibility is highlighted in various model deployment options. You can explore Ultralytics solutions for examples in different industries.
  2. Cross-Framework Collaboration and Deployment: A research team develops a novel model architecture in TensorFlow. Another team wants to integrate this model into an existing application built with PyTorch. By exporting the TensorFlow model to ONNX, the second team can run it alongside their PyTorch code through the standardized ONNX Runtime, on any server configuration (cloud or on-premise), without needing the original TensorFlow framework. This simplifies model serving and integration.