
How to use the Reference section of the Ultralytics YOLO Docs

Learn how to use the Reference section of the Ultralytics YOLO Docs to understand what is under the hood of the Ultralytics Python package.

Nowadays, artificial intelligence (AI) is more accessible than ever before, making it possible for anyone to dive in and quickly start using different AI models for various cutting-edge tasks. 

For example, computer vision is a branch of AI that enables computers to interpret and understand visual information from images and videos, and computer vision models like Ultralytics YOLO11 are easy to get started with.

YOLO11 supports tasks like object detection, instance segmentation, and image classification and can be used for applications like autonomous driving, security monitoring, and retail analytics.

Fig 1. YOLO11 can be used to detect various objects.

Specifically, the Ultralytics Python package provides user-friendly tools to quickly train, customize, and deploy these AI models, allowing users of all skill levels to easily build advanced computer vision applications. 

However, if you're interested in diving deeper into how everything works or creating your own customizations, the Reference section of the Ultralytics documentation is a great resource. It covers the inner workings of the Ultralytics Python package, including how your data is handled, the model training process, and how you can visualize predictions.

In this article, we’ll take a closer look at the Reference section of the Ultralytics documentation and how to use it when working on computer vision projects. Let’s get started!

A deeper look at working with Ultralytics YOLO models

Working with the Ultralytics Python package is straightforward: you can train YOLO models or detect objects in images with just a few lines of code.
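For instance, a minimal sketch along these lines (the weights file, dataset config, and image path are placeholders, and the training settings are purely illustrative) is enough to load a pretrained model, fine-tune it, and run a prediction:

from ultralytics import YOLO

# Load a pretrained YOLO11 model (the weights are downloaded automatically)
model = YOLO("yolo11n.pt")

# Fine-tune on a small sample dataset, then run a prediction on an image
model.train(data="coco8.yaml", epochs=3)
results = model.predict("path/to/image.jpg")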

However, once you're familiar with computer vision models, the Reference section of the Ultralytics documentation helps you take a deeper look at how the code works and what functions the package supports. It also includes easy-to-follow explanations, configurable options, and links to relevant code available in the Ultralytics GitHub repository.

It explains how the Ultralytics Python package is structured and covers key components like model setup, data loading, the training process, and how predictions are made and returned. 

Everything is organized into clear categories, so it's easy to find what you're looking for. For instance, if you're training a model with your own dataset, you can go to the part of the Reference section focused on data, and it’ll give you a better idea of how your data will be used for model training.  

How to get started with the Reference section

If you head to the Reference section in the Ultralytics YOLO documentation, you'll find a menu on the left side of the page with different reference categories. Each category represents a specific part of the Ultralytics codebase, such as models, data handling, or training functions. 

Clicking a category takes you to a page with more details about that part of the codebase.

Fig 2. On the left, you will find a menu of different Reference categories.

Similarly, on the right side of the page, you’ll find a table of contents that breaks each Reference page down into key components like functions (reusable blocks of code), classes (blueprints for creating objects), and methods (functions defined inside classes). This makes it easy to jump straight to what you're looking for.

Fig 3. On the right, you'll find a table of contents for the specific Reference page you're viewing.

The structure of the Ultralytics GitHub repository 

The Ultralytics GitHub repository is organized into subdirectories, or subpackages, that correspond to different parts of the Ultralytics package, such as models, training, and data. The Reference section in the documentation follows this same structure, which makes it easier to understand how everything fits together.

Here are some of the main subdirectories or categories you'll see in both the Ultralytics GitHub repository and the Reference section of the Ultralytics documentation:

  • Models: Focuses on the different models and their modes, such as making predictions, validating performance, and exporting trained models.

  • Engine: Contains the core logic for training, validating, predicting, exporting, and evaluating models.

  • Data: Manages how datasets are loaded, processed, and augmented. This includes functions for creating dataloaders (tools that feed data into the model in batches), applying transformations (changes made to images, like resizing or flipping, to help the model learn better), and preparing data (organizing and formatting the images and labels) for training.

  • Utils: Provides a wide range of helper functions used across the codebase, such as visualization tools, file handling, and metric calculations.

  • HUB: Connects to Ultralytics HUB, a no-code computer vision platform, enabling cloud features like logging in, uploading models, and managing datasets through an API.

  • Trackers: Implements object-tracking logic for applications involving video or frame-by-frame image sequences.

Each of these subdirectories in the GitHub repository has a corresponding section in the documentation. This structure is intentionally mirrored, making it easier to switch between reading the documentation and exploring the source code.
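As a rough illustration, those subdirectory names map directly onto Python import paths. The examples below use classes that live in these subpackages at the time of writing; exact module locations can shift between versions, so treat this as a sketch rather than a definitive list:

# Each import path mirrors a Reference category and a repository subdirectory
from ultralytics.models.yolo.detect import DetectionPredictor  # models
from ultralytics.engine.results import Results                 # engine
from ultralytics.data.augment import LetterBox                 # data
from ultralytics.utils.plotting import Annotator               # utils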

In fact, in many of the Reference pages, the actual source code is also displayed, so you can see exactly how functions and classes are implemented without leaving the documentation.

Fig 4. The source code is also included in the Reference pages.

Understanding the models, engine, and data components

Now that we've seen how the Reference section is organized, let’s take a closer look at three key parts of the Ultralytics package: models, engine, and data.

The models subdirectory contains the code that defines how each type of model works. It's organized by both model types (like YOLO, FastSAM, or RT-DETR) and tasks such as detection, segmentation, or classification. Inside each of these, you'll find files or modules that handle specific actions - for example, how the model makes predictions, how it gets trained, or how its performance is evaluated.

Meanwhile, the engine subdirectory works behind the scenes to manage the entire process. While the models subdirectory focuses on what each model is supposed to do, the engine subdirectory focuses on how to actually run those tasks in a consistent and efficient way. 

Finally, the data subdirectory is responsible for loading and preparing datasets. This part of the codebase ensures that your training data is clean, structured, and varied, helping the model learn better and generalize more effectively.

This clear separation makes the code easier to maintain, and it gives users the flexibility to customize it.
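For example, because the training workflow is split across the engine and models subpackages, a common customization pattern is to subclass a trainer and override just the piece you need. Here's a hedged sketch, assuming the DetectionTrainer class and the get_dataloader hook described on its Reference page (method signatures can change between versions):

from ultralytics.models.yolo.detect import DetectionTrainer

class CustomTrainer(DetectionTrainer):
    # Override a hook to log extra information; the available hooks are
    # documented on the corresponding Reference pages
    def get_dataloader(self, dataset_path, batch_size=16, rank=0, mode="train"):
        print(f"Building the {mode} dataloader for {dataset_path}")
        return super().get_dataloader(dataset_path, batch_size, rank, mode)

# The overrides dictionary uses standard training arguments; the values here are illustrative
trainer = CustomTrainer(overrides={"data": "coco8.yaml", "epochs": 1, "imgsz": 640})
trainer.train()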

Examples of using the Reference section

You might be wondering, why is it important to understand the different parts of the Ultralytics codebase? If you know which part of the code handles what, it becomes much easier to find the information you need, make changes, or troubleshoot problems. 

Here are some examples of how you can use the Reference section of the documentation:

  • If you're asking, “How does the model make predictions?”, you can go to the Models category in the Reference section, select a model type (like YOLO), pick a task (such as detect), and then open the Predict page for details. 
  • If you want to know how data augmentations are being applied, you can explore the Augment page under the Data category. It lists the built-in augmentation techniques used to add variety to the training data and improve model performance, and you can connect it directly to the code, as shown in the sketch after this list.
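One quick way to connect the Augment page to the code is to list the classes defined in the corresponding module. This is just a sketch that assumes the module path mirrors the repository layout described above (it may change across versions):

import inspect
from ultralytics.data import augment

# Print the classes defined directly in ultralytics/data/augment.py,
# which correspond to the entries on the Augment Reference page
for name, obj in inspect.getmembers(augment, inspect.isclass):
    if obj.__module__ == augment.__name__:
        print(name)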

Exploring results through the Reference section

The Reference section is also helpful when you're trying to understand the outputs returned by your model. After a model like YOLO11 runs inference on an image, it returns a set of results that describe what was detected.

For example, in a camera feed, it might detect a person and highlight their location using a bounding box, along with a confidence score - a value between 0 and 1 that indicates how certain the model is about the detection.

If you're trying to understand how to use that output in your project, the Reference section can guide you. It includes a page for the Results module that breaks down what’s included and how to access it in your code. There are details on how to view detection boxes, check confidence scores, display results, or save them.
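As a minimal sketch of what that looks like in practice (the weights file, image path, and confidence threshold are placeholders):

from ultralytics import YOLO

model = YOLO("yolo11n.pt")
results = model.predict("path/to/image.jpg", conf=0.25)

for result in results:
    for box in result.boxes:
        print(box.xyxy, box.conf, box.cls)  # bounding box coordinates, confidence score, class id
    result.show()                           # display the annotated image
    result.save(filename="annotated.jpg")   # or save it to disk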

Fig 5. An example of how results returned by YOLO11 can be visualized.

Key takeaways

The Ultralytics documentation helps you understand how to use YOLO models effectively. It explains key processes such as training models, preparing data, and working with results. Each page has clear explanations and example code snippets to help you get started quickly.

If you're curious about what happens behind the scenes, the Reference section of the documentation also breaks it down step by step. It shows how the code is structured, what each part does, and how everything works together. This makes it easier to learn, customize, and confidently build your own computer vision projects.

Be part of our active community and explore the GitHub repository to learn more about building with AI. Ready to launch your own computer vision ideas? Visit our licensing options to get started. See how Vision AI in automotive and AI in healthcare are making an impact by visiting our solutions pages.
