Agentic RAG
What is Agentic RAG?
While RAG helps ground language models in factual, domain-specific data, basic implementations have limitations:
Common pitfalls of simple RAG:
- Single-step retrieval – If the initial retrieval is inaccurate, the model’s response suffers, with no way to refine the search.
- Query mismatch – Directly matching the user’s query can miss relevant results if phrasing differs (e.g., question vs. statement).
How Agentic RAG improves RAG
An agentic RAG system overcomes these issues by:
- Iterating retrieval – Refining queries when initial results are off-target.
- Rewriting queries – Adjusting phrasing to better match stored documents.
By actively optimizing its search strategy, an agentic RAG system surfaces more relevant content than a single-step approach.
How to build a RAG Agent
By turning RAG into an agent-based approach, we can address these limitations more effectively.
Preparing the knowledge base
In this example, we load a dataset of documentation pages for various Hugging Face libraries, keeping only those related to the `transformers` library. We then split this data into smaller chunks, which form our searchable knowledge base.
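The snippet below is a minimal sketch of this preparation step. The dataset name, the `source` filter, and the chunking parameters are assumptions (borrowed from a common Hugging Face documentation dump); substitute your own corpus as needed.

```python
import datasets
from langchain.docstore.document import Document
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Load documentation pages and keep only those from the transformers library.
# The dataset name and the `source` column are assumptions for this sketch.
knowledge_base = datasets.load_dataset("m-ric/huggingface_doc", split="train")
knowledge_base = knowledge_base.filter(
    lambda row: row["source"].startswith("huggingface/transformers")
)

# Wrap each page in a Document, keeping its source path as metadata.
source_docs = [
    Document(page_content=row["text"], metadata={"source": row["source"]})
    for row in knowledge_base
]

# Split pages into overlapping chunks small enough for retrieval.
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,
    chunk_overlap=50,
    separators=["\n\n", "\n", ".", " ", ""],
)
docs_processed = text_splitter.split_documents(source_docs)
```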
At this point, `docs_processed` is a structured collection of document chunks, ready to be used by a retriever tool.
Retriever Tool
The Galadriel framework provides a `RetrieverTool` (visit Tools for more details) that uses BM25, a keyword-based ranking function chosen here for simplicity, to fetch relevant passages. The agent calls this tool whenever it needs more information.
Note: You can replace BM25 with a vector-based retriever for better semantic matching.
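To make the tool's behavior concrete, here is a rough sketch of what a BM25-backed retriever tool can look like. It is modeled on smolagents' `Tool` base class and langchain's `BM25Retriever` as stand-ins, not on Galadriel's actual `RetrieverTool` implementation; the class shape, `k=10`, and the output format are all assumptions.

```python
from langchain_community.retrievers import BM25Retriever
from smolagents import Tool

class RetrieverTool(Tool):
    # The name and description tell the agent when to call this tool.
    name = "retriever_tool"
    description = "Uses BM25 search to retrieve the documentation passages most relevant to your query."
    inputs = {
        "query": {
            "type": "string",
            "description": "The search query. Affirmative phrasing matches documents better than questions.",
        }
    }
    output_type = "string"

    def __init__(self, docs, **kwargs):
        super().__init__(**kwargs)
        # BM25 ranks chunks by keyword overlap with the query.
        self.retriever = BM25Retriever.from_documents(docs, k=10)

    def forward(self, query: str) -> str:
        docs = self.retriever.invoke(query)
        # Concatenate the top chunks into one string for the agent to read.
        return "\nRetrieved documents:\n" + "".join(
            f"\n===== Document {i} =====\n{doc.page_content}"
            for i, doc in enumerate(docs)
        )

retriever_tool = RetrieverTool(docs_processed)
```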
Initializing the Agent
The agent will attempt to solve the user's query by:
- Reasoning through the user's prompt step by step.
- Calling `retriever_tool` to gather relevant context from the knowledge base.
- Adjusting or refining the query if needed.
- Providing a final answer once enough context is gathered.
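As a minimal sketch of this loop, assuming a smolagents-style `CodeAgent` and `InferenceClientModel` as stand-ins for Galadriel's own initialization (the model choice and step budget are assumptions):

```python
from smolagents import CodeAgent, InferenceClientModel

agent = CodeAgent(
    tools=[retriever_tool],        # the agent may call the retriever several times
    model=InferenceClientModel(),  # any capable chat model works here
    max_steps=4,                   # bounds the reason-retrieve-refine loop
)

answer = agent.run(
    "For a transformers model training, which is slower: the forward or the backward pass?"
)
print(answer)
```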
In the execution logs, you can see that the agent selected semantically relevant chunks of text and produced the final answer: The backward pass is typically slower than the forward pass during transformers model training.
Conclusion
RAG agents provide a powerful way to ground your language model's outputs in factual, domain-specific data. They do this by:
- Rewriting queries – Agents can better match the style or content of the relevant documents.
- Iterating retrieval – Agents critique and refine their searches when initial results are insufficient.
- Controlling knowledge access – You decide what information the agent can retrieve, ensuring compliance and data security.
With this approach, you can address the pitfalls of single-step or query-mismatch retrieval and tap into more flexible, accurate, and controllable LLM interactions.