Vector Database

Learn how vector databases support intelligent systems through efficient similarity search, semantic search, and anomaly detection.

A vector database is a specialized database designed to store, manage, and search high-dimensional data known as vector embeddings. Unlike traditional relational databases, which are optimized for structured data and exact matches, vector databases excel at finding items based on similarity. This capability underpins a wide range of modern AI applications, from recommendation engines to visual search, and makes vector databases a critical component of machine learning infrastructure. They are often described as long-term memory for AI models: embeddings produced by a trained model are stored once and retrieved later, so applications can reuse the patterns the model learned during training.

How Vector Databases Work

The core function of a vector database is to efficiently execute a vector search. The process begins when unstructured data—such as an image, a block of text, or an audio clip—is passed through a deep learning model to create a numerical representation called a vector embedding. These embeddings capture the semantic meaning of the original data.

The vector database then stores these embeddings and indexes them using specialized algorithms. When a query is made (e.g., searching with an image), the query data is also converted into a vector. The database compares this query vector to the stored vectors using a similarity metric such as Cosine Similarity (higher means more similar) or Euclidean Distance (lower means more similar) to find the "nearest" items. To do this at scale with millions or billions of vectors, vector databases often rely on Approximate Nearest Neighbor (ANN) algorithms, which trade a small amount of accuracy for much faster search.
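The comparison step described above can be sketched in a few lines of plain Python. This is a minimal illustration of exact (brute-force) nearest-neighbor search, not how any particular vector database is implemented. The function names, the item names, and the three-dimensional vectors are all made up for the example; real embeddings typically have hundreds or thousands of dimensions and come from a trained model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; higher means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between two vectors; lower means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_by_cosine(query, stored, k=2):
    """Return the names of the k stored vectors most similar to the query."""
    ranked = sorted(stored, key=lambda name: cosine_similarity(query, stored[name]), reverse=True)
    return ranked[:k]


def nearest_by_euclidean(query, stored, k=2):
    """Return the names of the k stored vectors closest to the query."""
    ranked = sorted(stored, key=lambda name: euclidean_distance(query, stored[name]))
    return ranked[:k]


# Hypothetical, hand-written "embeddings" used only for illustration.
stored = {
    "red_shoe": [0.9, 0.1, 0.0],
    "crimson_sneaker": [0.8, 0.2, 0.1],
    "blue_kettle": [0.0, 0.1, 0.9],
}
query = [1.0, 0.0, 0.0]

top_cosine = nearest_by_cosine(query, stored)
top_euclidean = nearest_by_euclidean(query, stored)
print(top_cosine)
print(top_euclidean)
```

This sketch compares the query against every stored vector, so its cost grows linearly with the size of the collection. ANN indexes avoid that full scan by examining only a promising subset of vectors, which is why they can occasionally miss a true nearest neighbor.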

Real-World Applications

Vector databases power many intelligent features that users interact with daily.

  1. Visual Search in E-commerce: A user can upload a photo of a product they like. A computer vision model, such as an Ultralytics YOLO11 model, generates an embedding for the image. This embedding is used to query the e-commerce site's vector database, which contains embeddings for its entire product catalog. The database returns the most similar vectors, allowing the site to show visually identical or stylistically related products, a key feature in AI for retail.
  2. Semantic Search for Documents: A company can create embeddings for all its internal documents, such as reports and support tickets. An employee can then search using a natural language question like "What were our profits last quarter?" instead of specific keywords. A Natural Language Processing (NLP) model converts this query into an embedding, and the vector database finds the documents whose embeddings are semantically closest, returning relevant information even if the exact phrasing doesn't match. This is a core component of retrieval-augmented generation (RAG) systems.
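Both applications follow the same store-then-query workflow, which the sketch below illustrates with a toy in-memory store. `TinyVectorStore`, the document IDs, and the three-dimensional vectors are hypothetical and exist only for this example. In a real system the embeddings would come from a vision or NLP model, and the store would be an actual vector database with persistence and an ANN index.

```python
import math


class TinyVectorStore:
    """Hypothetical in-memory store that ranks items by cosine similarity."""

    def __init__(self):
        self._ids = []
        self._vectors = []

    @staticmethod
    def _normalize(vec):
        # Scale to unit length so a dot product equals cosine similarity.
        norm = math.sqrt(sum(x * x for x in vec))
        if norm == 0:
            raise ValueError("cannot normalize a zero vector")
        return [x / norm for x in vec]

    def add(self, item_id, embedding):
        """Store an item's embedding under its ID."""
        self._ids.append(item_id)
        self._vectors.append(self._normalize(embedding))

    def query(self, embedding, top_k=3):
        """Return up to top_k (item_id, score) pairs, most similar first."""
        q = self._normalize(embedding)
        scored = [
            (item_id, sum(a * b for a, b in zip(q, vec)))
            for item_id, vec in zip(self._ids, self._vectors)
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:top_k]


store = TinyVectorStore()
# Made-up vectors standing in for model-generated document embeddings.
store.add("q3_financial_report", [0.9, 0.1, 0.1])
store.add("support_ticket_4521", [0.1, 0.9, 0.2])
store.add("annual_budget_memo", [0.7, 0.2, 0.3])

# Stand-in for the embedding of "What were our profits last quarter?"
query_embedding = [0.8, 0.1, 0.2]
results = store.query(query_embedding, top_k=2)
for item_id, score in results:
    print(item_id, round(score, 3))
```

Normalizing vectors when they are added means each query needs only dot products, a common design choice when cosine similarity is the metric. In a RAG system, the text of the top-ranked documents would then be passed to a language model as context.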
