This publication presents a single-agent Retrieval-Augmented Generation (RAG) AI assistant that answers user queries strictly from a fixed set of curated text documents. Built with LangChain and a Groq-hosted large language model, the system enforces retrieval-only answering, similarity-based gating, and polite refusal of out-of-scope queries to prevent hallucinations. The project demonstrates foundational principles of Agentic AI, controlled reasoning, and safe AI system design, and serves as a practical reference implementation for document-grounded AI assistants.
Purpose:
The purpose of this project is to design and implement a safe, agentic RAG-based AI assistant that can answer questions only from provided documents while reliably refusing unsupported queries.
Objectives:
Demonstrate a single-agent Agentic RAG architecture
Implement strict anti-hallucination safeguards
Ensure answers are fully grounded in retrieved documents
Provide a usable interactive UI for demonstration
Maintain a clean, reproducible, open-source implementation
Large Language Models often generate confident but incorrect answers when asked questions outside their knowledge scope. This behavior—commonly known as hallucination—poses risks in educational, technical, and decision-support settings.
While Retrieval-Augmented Generation (RAG) mitigates this issue by grounding responses in external documents, many implementations still allow:
Partial guessing when retrieval fails
Silent fallback to model knowledge
Unclear boundaries of applicability
This project addresses these gaps by designing a strictly controlled Agentic RAG assistant that:
Treats retrieved documents as the only source of truth
Refuses to answer when relevant context is unavailable
Clearly communicates system limitations to the user
Target Audience:
AI/ML learners studying RAG and Agentic AI
Developers building document-based chatbots
Educators demonstrating safe AI system design
Evaluators and reviewers assessing RAG correctness
Use Cases:
Querying curated knowledge bases
Educational demonstrations of RAG behavior
Prototyping safe enterprise or academic assistants
Understanding hallucination mitigation techniques
Data Source:
The assistant uses static, author-provided text documents, each covering a distinct domain:
Artificial Intelligence
Biotechnology
Climate Science
Quantum Computing
Space Exploration
Sustainable Energy
These documents are stored as .txt files and manually curated to ensure clarity and relevance.
In scope: Questions answerable directly from the documents
Out of scope: Any query requiring external knowledge, current events, or personal opinions
The dataset is intentionally small and controlled to prioritize verifiability and safety over breadth.
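As an illustration of this setup, the sketch below loads and chunks such .txt files with LangChain. The directory name, chunk size, and overlap are assumptions for the example, not values taken from the repository.

```python
import os

from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

DATA_DIR = "data"  # hypothetical folder holding the curated .txt files


def load_and_chunk_documents(data_dir: str = DATA_DIR):
    """Load every .txt file and split it into overlapping chunks for retrieval."""
    documents = []
    for filename in sorted(os.listdir(data_dir)):
        if filename.endswith(".txt"):
            path = os.path.join(data_dir, filename)
            documents.extend(TextLoader(path, encoding="utf-8").load())

    # Chunk size and overlap are illustrative and should be tuned to the corpus.
    splitter = RecursiveCharacterTextSplitter(chunk_size=800, chunk_overlap=100)
    return splitter.split_documents(documents)
```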
System Workflow:
Documents are embedded and stored in a vector database (at ingestion, before queries are served)
The user submits a query
A retriever selects the most relevant document chunks
A single agent evaluates retrieved context
The LLM generates a response only from retrieved text
If context is insufficient, the system refuses to answer
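A condensed sketch of the embedding and retrieval steps, continuing the loading example above. FAISS and the sentence-transformers embedding model are assumptions here; the repository may equally use Chroma or a different embedder.

```python
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS

# Illustrative embedding model; any LangChain-compatible embedder would work.
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")

# Embed the curated chunks and keep them in an in-memory FAISS index.
vectorstore = FAISS.from_documents(load_and_chunk_documents(), embeddings)


def retrieve(query: str, k: int = 4):
    """Return the top-k chunks with relevance scores normalized to [0, 1]."""
    return vectorstore.similarity_search_with_relevance_scores(query, k=k)
```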
Design Decisions:
Single Agent: Keeps the architecture focused on foundational Agentic AI concepts
No External Tools: Eliminates knowledge leakage
Similarity Thresholding: Prevents weak matches from influencing answers
Strict System Prompting: Enforces refusal behavior
The system implements multiple layers of hallucination prevention:
Retrieval-Only Answering: The LLM never sees information beyond retrieved chunks
Similarity Gating: Low-confidence retrieval results trigger refusal
Explicit Refusal Policy: The assistant responds with a polite message when information is unavailable
No Web Access: Keeps the knowledge base fixed so behavior stays reproducible
These measures collectively ensure predictable, safe outputs.
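A minimal sketch of how these safeguards could be combined on top of the retriever sketched above: similarity gating, a strict retrieval-only system prompt, and an explicit refusal message. The threshold, Groq model name, and refusal wording are illustrative assumptions.

```python
from langchain_groq import ChatGroq

REFUSAL = "I'm sorry, but the provided documents do not contain enough information to answer that."
SCORE_THRESHOLD = 0.5  # illustrative cut-off; chunks scoring below it are discarded

# ChatGroq reads GROQ_API_KEY from the environment; temperature 0 reduces variation.
llm = ChatGroq(model="llama-3.1-8b-instant", temperature=0)


def answer(query: str) -> str:
    # Similarity gating: refuse instead of guessing when retrieval is weak.
    relevant = [doc for doc, score in retrieve(query) if score >= SCORE_THRESHOLD]
    if not relevant:
        return REFUSAL

    # Retrieval-only answering: the model sees nothing beyond the retrieved chunks.
    context = "\n\n".join(doc.page_content for doc in relevant)
    system_prompt = (
        "Answer strictly and only from the context below. If the context does not "
        f"contain the answer, reply exactly with: {REFUSAL}\n\nContext:\n{context}"
    )
    return llm.invoke([("system", system_prompt), ("human", query)]).content
```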
Technology Stack:
Python
LangChain (orchestration and retrieval)
Groq LLM (generation)
FAISS / Chroma (vector storage)
Streamlit (user interface)
Security:
API keys are loaded via environment variables
No secrets are committed to version control
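One common way to handle this, assuming a local .env file read with python-dotenv and a minimal Streamlit page wrapping the answer function sketched earlier (both assumptions; the repository's setup instructions may differ).

```python
import os

import streamlit as st
from dotenv import load_dotenv

# Load GROQ_API_KEY from a .env file that is excluded from version control.
load_dotenv()
if "GROQ_API_KEY" not in os.environ:
    raise RuntimeError("GROQ_API_KEY is not set; add it to your environment or .env file.")

st.title("Agentic RAG Assistant")
query = st.text_input("Ask a question about the curated documents")
if query:
    st.write(answer(query))  # answer() as sketched above
```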
The complete implementation is available as an open-source repository.
Validation is performed through:
In-scope queries with known answers from documents
Out-of-scope queries to verify refusal behavior
Manual inspection of responses for grounding correctness
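A small spot-check sketch of this process; the queries and the exact refusal string are illustrative and tied to the assumptions made in the earlier sketches.

```python
# Illustrative spot checks; real queries depend on the curated documents.
in_scope = "What is quantum computing?"
out_of_scope = "Who won yesterday's football match?"

assert answer(out_of_scope) == REFUSAL  # out-of-scope queries trigger refusal
assert REFUSAL not in answer(in_scope)  # in-scope queries are answered from the documents
```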
The system consistently:
Answers correctly when context exists
Refuses when context is missing
Avoids speculative or fabricated outputs
Limitations:
Knowledge is limited strictly to the provided documents
No multi-turn conversational memory
No automatic document updates or ingestion
Not designed for real-time or web-based knowledge
These limitations are intentional to preserve safety and clarity.
This architecture is suitable for:
Educational assistants
Internal documentation chatbots
Compliance-sensitive environments
Early-stage RAG prototypes
It demonstrates how Agentic AI principles can be applied responsibly in real systems.
Key Contributions:
Demonstrates a correct RAG implementation, not superficial retrieval
Shows how agentic decision-making improves safety
Highlights practical hallucination mitigation techniques
Provides a clean, extensible reference implementation
Code Repository:
https://github.com/dharamshiyash/agentic-rag-assistant
The repository includes full setup instructions, dependencies, and usage examples.
Future Enhancements:
Multi-turn conversational RAG
Source citation in responses
Support for PDF and structured documents
Multi-agent extensions for advanced reasoning
This project demonstrates that safe, reliable AI assistants require more than powerful models. By combining Agentic AI principles with strict retrieval grounding, the system achieves predictable behavior, transparent limitations, and trustworthy outputs. The work serves as a strong foundational example of how RAG systems should be designed when correctness and safety matter.