Glossary

ONNX (Open Neural Network Exchange)

Discover how ONNX enhances AI model portability and interoperability, enabling seamless deployment of Ultralytics YOLO models across diverse platforms.

In the rapidly evolving field of artificial intelligence (AI) and machine learning (ML), moving models between different tools and platforms efficiently is crucial. ONNX (Open Neural Network Exchange) addresses this challenge by providing an open-source format designed specifically for AI models. It acts as a universal translator, allowing developers to train a model in one framework, like PyTorch, and then deploy it using another framework or inference engine, such as TensorFlow or specialized runtimes. This interoperability streamlines the path from research to production.

Relevance of ONNX

The core value of ONNX lies in promoting portability and interoperability within the AI ecosystem. Instead of being locked into a single framework, developers can use ONNX to move models freely between tools. By defining a common set of operators and a standard file format, ONNX ensures that a model's structure and learned parameters (weights) are represented consistently. This is particularly beneficial for users of Ultralytics YOLO models, as Ultralytics provides straightforward methods for exporting models to ONNX format. This export capability allows users to take models like YOLOv8 or YOLO11 and deploy them on a wide variety of hardware and software platforms, often utilizing optimized inference engines for enhanced performance.

How ONNX Works

ONNX achieves interoperability through several key features:

  • Common Model Representation: It defines a standard set of operators (like convolution layers or activation functions) and data types. When a model is converted to ONNX, its architecture is translated into this shared language.
  • Graph-Based Structure: Models are represented as computational graphs, where nodes are operations and edges represent the flow of data (tensors). This is a common structure used by most deep learning frameworks.
  • Extensibility: While ONNX defines a core set of operators, it allows for custom operators, enabling support for novel model architectures.
  • Versioning: ONNX maintains operator versions to ensure backward compatibility, meaning models created with older versions can still be used as the standard evolves.

Applications of ONNX

ONNX is widely used to bridge the gap between model training environments and deployment targets. Here are two examples:

  1. Optimized Deployment on Edge Devices: A developer trains an object detection model using Ultralytics YOLO on a powerful server with GPUs. To deploy this model on resource-constrained edge devices, they export the model to ONNX. The ONNX model can then be optimized using tools like NVIDIA TensorRT for NVIDIA hardware or Intel's OpenVINO for Intel CPUs/VPUs, achieving faster and more efficient real-time inference. See our guide on model deployment options for more details.
  2. Cross-Framework Collaboration: A research team develops a novel model component in PyTorch. Another team, responsible for integrating this component into a larger application built with TensorFlow, can receive the component as an ONNX file. This avoids the need for complex code translation or maintaining separate model versions for different frameworks, fostering easier collaboration within organizations like those listed on our customers page.