Prompt Enrichment

Master AI with prompt enrichment! Enhance Large Language Models' outputs using context, clear instructions, and examples for precise results.

Prompt enrichment is the process of automatically or semi-automatically enhancing a user's initial input prompt before it is processed by an Artificial Intelligence (AI) model, especially a Large Language Model (LLM). The primary objective is to improve the quality, relevance, and specificity of the AI's output by adding relevant contextual information, clarifying potential ambiguities, setting constraints, or including specific details. This technique refines the interaction between users and AI systems: prompts become more effective without requiring users to have deep expertise in prompt engineering, which improves the overall user experience (UX).

How Prompt Enrichment Works

The enrichment process typically begins by analyzing the original user prompt. Based on this analysis, the system draws on additional information sources or predefined rules to augment the prompt. This might involve accessing user interaction history, retrieving pertinent documents from a knowledge base, incorporating the context of the ongoing conversation, or applying specific formatting instructions required by the model. For example, a simple prompt like "Summarize the latest Ultralytics developments" could be enriched to specify "Summarize the key features and performance improvements of Ultralytics YOLO11 compared to YOLOv8, focusing on object detection tasks." Techniques like Retrieval-Augmented Generation (RAG) are commonly used: the system fetches relevant data snippets (e.g., from Ultralytics Docs) and adds them to the prompt, within the limits of the model's context window, before sending it to the LLM. This gives the model the background it needs to generate a comprehensive and accurate response.
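The retrieval step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production RAG pipeline: the knowledge base is a hypothetical in-memory list, relevance is approximated by simple word overlap rather than embeddings, and the function names (`tokenize`, `retrieve`, `enrich_prompt`) and the prompt template are invented for this example.

```python
import re

# Illustrative snippets standing in for a real knowledge base (hypothetical data).
KNOWLEDGE_BASE = [
    "Object detection models predict bounding boxes and class labels for objects in an image.",
    "Retrieval-augmented generation adds retrieved text to a prompt before it reaches the model.",
    "Image classification assigns a single label to a whole image.",
]


def tokenize(text: str) -> set:
    """Lowercase the text and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, documents: list, top_k: int = 2) -> list:
    """Rank documents by word overlap with the query and keep the best matches."""
    query_tokens = tokenize(query)
    scored = [(len(query_tokens & tokenize(doc)), doc) for doc in documents]
    # Drop documents that share no words with the query.
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def enrich_prompt(user_prompt: str, documents: list, top_k: int = 2) -> str:
    """Prepend retrieved snippets to the user prompt; return it unchanged if none match."""
    snippets = retrieve(user_prompt, documents, top_k)
    if not snippets:
        return user_prompt
    context = "\n".join(f"- {snippet}" for snippet in snippets)
    return (
        "Use the context below to answer.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {user_prompt}"
    )


if __name__ == "__main__":
    print(enrich_prompt("How does object detection work?", KNOWLEDGE_BASE))
```

In a real system the overlap score would usually be replaced by a vector similarity search, and the number of snippets would be capped so the enriched prompt stays within the model's context window.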

Applications and Examples

Prompt enrichment is valuable across numerous AI-driven applications, enhancing interaction quality and task performance:

  • Customer Support Chatbots: A customer asking "What's the status of my order?" can have their prompt enriched with their user ID or recent order number retrieved from a Customer Relationship Management (CRM) system via API integration. The enriched prompt allows the chatbot to provide a specific update immediately, rather than asking follow-up clarifying questions.
  • Virtual Assistants for Personalization: When a user asks a virtual assistant like Google Assistant or Alexa to "Play some music," the prompt can be enriched based on the user's listening history, preferred genres, time of day, or even current activity detected via connected devices, leading to a more personalized music selection.
  • Content Creation Tools: A creative writing assistant using text generation might receive a vague prompt like "Write a story." Prompt enrichment could add details based on previous interactions, such as "Write a short science fiction story set in a dystopian future, featuring a rebellious protagonist," making the output more aligned with the user's likely interests.
  • Semantic Search Systems: When searching internal company documents, a query like "Find reports on Q4 performance" can be enriched with the user's department, role, and access permissions so that the system retrieves only the most relevant documents the user is permitted to see from a vast data lake.
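As a rough illustration of the customer support case above, the sketch below attaches order details to a prompt before it reaches the chatbot's model. Everything here is hypothetical: `FAKE_CRM` stands in for a real CRM API, and the record fields, identifiers, and the bracketed context format are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CustomerRecord:
    """Hypothetical subset of the fields a CRM might return for a customer."""
    user_id: str
    latest_order_id: str
    order_status: str


# Stand-in for a CRM API; a real system would make an authenticated request.
FAKE_CRM = {
    "u-123": CustomerRecord("u-123", "ord-789", "shipped"),
}


def lookup_customer(user_id: str) -> Optional[CustomerRecord]:
    """Return the customer's record, or None if the user is unknown."""
    return FAKE_CRM.get(user_id)


def enrich_support_prompt(user_prompt: str, user_id: str) -> str:
    """Append known order details to the prompt; leave it unchanged if lookup fails."""
    record = lookup_customer(user_id)
    if record is None:
        return user_prompt
    return (
        f"{user_prompt}\n\n"
        f"[Context] user_id={record.user_id}; "
        f"latest_order_id={record.latest_order_id}; "
        f"order_status={record.order_status}"
    )


if __name__ == "__main__":
    print(enrich_support_prompt("What's the status of my order?", "u-123"))
```

Falling back to the original prompt when no record is found keeps the chatbot usable for unknown users, who can then be asked a clarifying question as before.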

Relevance in Computer Vision

While prompt enrichment is most commonly associated with LLMs and Natural Language Understanding (NLU), its principles are becoming relevant in Computer Vision (CV). Traditional CV tasks, such as standard object detection with Ultralytics YOLO models, typically rely on image inputs rather than complex text prompts. However, newer multi-modal models and promptable vision systems, such as CLIP, YOLO-World, and YOLOE, accept text or image prompts to guide tasks like zero-shot classification and open-vocabulary detection. For these models, enriching a simple text prompt (e.g., "detect vehicles") with more context (e.g., "detect only emergency vehicles like ambulances and fire trucks in this traffic camera feed") could significantly improve performance and specificity. Platforms like Ultralytics HUB could potentially integrate such techniques to simplify user interaction when defining complex vision tasks or analyzing results. This remains an area of ongoing AI research and development aimed at improving AI safety and usability across domains.
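For promptable detectors, one simple form of enrichment is expanding a coarse request into specific class names. The sketch below assumes a hand-written, hypothetical mapping (`CLASS_EXPANSIONS`) and a made-up helper name (`enrich_class_prompt`); it does not call any vision model.

```python
# Hypothetical mapping from coarse requests to more specific class names.
CLASS_EXPANSIONS = {
    "vehicles": ["car", "truck", "bus", "motorcycle"],
    "emergency vehicles": ["ambulance", "fire truck", "police car"],
}


def enrich_class_prompt(prompt: str) -> list:
    """Return specific class names for the longest coarse term found in the prompt."""
    text = prompt.lower()
    # Check longer terms first so "emergency vehicles" wins over "vehicles".
    for term in sorted(CLASS_EXPANSIONS, key=len, reverse=True):
        if term in text:
            return list(CLASS_EXPANSIONS[term])
    # No known term matched: fall back to the original prompt as a single class.
    return [prompt]


if __name__ == "__main__":
    classes = enrich_class_prompt("detect emergency vehicles")
    print(classes)
    # The resulting list could then be handed to an open-vocabulary detector,
    # for example through a class-setting call such as YOLO-World's set_classes
    # in the Ultralytics package (not executed here).
```

A mapping like this could also be produced by an LLM or a taxonomy lookup instead of being written by hand; the point is only that the detector receives a more specific vocabulary than the user typed.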