Foundational Agentic RAG is a template-level framework for building secure, document-grounded AI assistants using Retrieval-Augmented Generation (RAG) and agentic reasoning. Designed with scalability and modularity in mind, it enables individuals or organizations to index internal knowledge (PDFs only) and interact with it through a conversational UI. The system is powered by OpenAI's GPT models, FAISS vector search, LangChain/LangGraph for agent behavior, and a customizable Vite + React frontend with Clerk authentication.
Large Language Models (LLMs) are powerful, but they don't "know" your private documents by default. Retrieval-Augmented Generation (RAG) fixes that, and Foundational Agentic RAG goes a step further by embedding an agent that can reason through intermediate steps to better answer complex queries.
Instead of just stuffing chunks of documents into prompts, this framework enables a more thoughtful and context-aware interaction with your PDFs, all while being adaptable to your organization's branding, models, or access controls.
- Documents are placed in a source directory (configurable via `SOURCE_DOC_DIR`).
- The indexing script (`source_indexing.py`, or `run_indexing.sh`) extracts content and creates a vector index using OpenAI embeddings.
- Extracted images are stored separately (configurable via `IMAGE_DIR`).
- The FAISS index is persisted to disk (configurable via `FAISS_INDEX_DIR`).
- Retrieval supports a configurable number of retrieved chunks (`TOP_K`) and system prompt customization.

Foundational Agentic RAG is more than just a chatbot over documents: it's a ready-to-deploy framework that brings the power of LLMs and structured retrieval to real-world use cases. Whether you're an individual exploring agent-based AI, or an organization wanting a secure, private assistant over internal documentation, this project gives you the tools to build fast, think smart, and scale confidently.
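The ingestion and retrieval flow described earlier (PDFs chunked, embedded, indexed in FAISS, then queried with a `TOP_K` cutoff) can be sketched end to end. This is a toy stand-in, not the project's code: a hash-based `embed` function replaces OpenAI embeddings and a plain Python list replaces the FAISS index, but the chunk → embed → index → top-k search shape is the same.

```python
# Toy end-to-end sketch of the indexing and retrieval pipeline.
# A hash-based embedding stands in for OpenAI embeddings, and a plain
# list of (vector, chunk) pairs stands in for the FAISS index.
import hashlib
import math


def chunk_text(text, size=50, overlap=10):
    """Split text into overlapping character chunks."""
    chunks = []
    step = size - overlap
    for start in range(0, max(len(text) - overlap, 1), step):
        chunks.append(text[start:start + size])
    return chunks


def embed(text, dims=8):
    """Stand-in for an embedding model: deterministic, unit-norm vector."""
    digest = hashlib.sha256(text.encode()).digest()
    vec = [b / 255 for b in digest[:dims]]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def build_index(docs):
    """Index every chunk of every document as (vector, chunk) pairs."""
    return [(embed(chunk), chunk) for doc in docs for chunk in chunk_text(doc)]


def search(index, query, top_k=3):
    """Return the top_k chunks most similar to the query (cosine similarity)."""
    q = embed(query)
    scored = [(sum(a * b for a, b in zip(q, v)), chunk) for v, chunk in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:top_k]]
```

In the real system the indexing step runs once, offline, and only the search step runs per query.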
Traditional RAG systems retrieve information and immediately use it to answer queries. In contrast, this project implements an agentic RAG using a ReAct-style loop, enabling the agent to reason through multiple steps, akin to a chain-of-thought, before responding. This leads to more accurate, context-sensitive answers.
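A minimal sketch of what such a ReAct-style loop looks like, assuming a hypothetical `llm` callable and `tools` dict (illustrative names, not the project's LangGraph implementation): the model emits thought, action, or final-answer steps, each tool observation is appended to the transcript, and the loop exits when the model commits to an answer.

```python
# Sketch of a ReAct-style agent loop: reason -> act (call a tool) -> observe,
# repeated until the model produces a final answer or the step budget runs out.
def react_agent(question, llm, tools, max_steps=5):
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = llm(transcript)  # model emits a Thought, Action, or Final Answer line
        transcript += step + "\n"
        if step.startswith("Final Answer:"):
            return step.removeprefix("Final Answer:").strip()
        if step.startswith("Action:"):
            # "Action: <tool_name> <tool_input>" -> run the tool, feed back the result
            tool_name, _, tool_input = step.removeprefix("Action:").strip().partition(" ")
            observation = tools[tool_name](tool_input)
            transcript += f"Observation: {observation}\n"
    return "Could not reach an answer within the step budget."
```

The key difference from plain RAG is that retrieval is just one tool call inside the loop, so the agent can retrieve again with a refined query if the first observation is insufficient.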
This decision enhances security, consistency, and content control: only the owner uploads documents and runs the indexing script, so the application is not meant for end-user file uploads. This suits internal tools or static knowledge bases, but may not fit customer-facing use cases that require personalized document interactions.
Since all documents are pre-indexed and never leave the owner's environment unless explicitly routed through APIs like OpenAI, the system can be deployed in controlled, private infrastructure. Clerk's configurable authentication supports compliance needs such as enterprise SSO, audit trails, and identity verification protocols.
Fine-tuned models may outperform in extremely domain-specific or stylistically rigid tasks (e.g., legal or medical advice) where retrieval context isn't enough. However, for dynamic or mixed-topic knowledge bases, a RAG system remains more flexible and cost-effective.
The system is modular and horizontally scalable. The vector index (FAISS) can be hosted on higher-end storage or converted into a persistent DB-backed vector store. LangChain supports multiple backends, and the React + FastAPI architecture supports containerization, CI/CD, and horizontal scaling. That said, for extremely large corpora or real-time streaming retrieval, some engineering work (like using distributed vector databases or caching layers) would be needed.
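As a concrete illustration of the hosting point, the index can be built once and then loaded read-only by each stateless API replica. A stdlib-only sketch with JSON as a stand-in format (`save_index`/`load_index` are hypothetical helpers; FAISS has its own on-disk serialization, and a DB-backed vector store would replace the file entirely):

```python
# Sketch: persist a (vector, chunk) index once at indexing time, then have
# each horizontally scaled API replica load the same read-only file at startup.
import json
from pathlib import Path


def save_index(index, path):
    """Write the index produced by the indexing step to disk."""
    Path(path).write_text(json.dumps(index))


def load_index(path):
    """Load the shared index; JSON turns tuples into lists, so re-wrap them."""
    return [(vec, chunk) for vec, chunk in json.loads(Path(path).read_text())]
```

Because replicas only read the index, no locking is needed; re-indexing produces a new file that replicas pick up on restart or reload.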
Highly customizable. The system prompt and few-shot examples are directly editable in the backend (`app.py`, in `rag_agent_node`). Role-based instruction styles can also be used to shape tone, verbosity, or response structure.
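A sketch of how role-based instruction styles might be assembled into a system prompt. `STYLES` and `build_system_prompt` are illustrative names for this example, not the actual code in `rag_agent_node`:

```python
# Illustrative prompt assembly: a base system prompt plus a role-based
# instruction style and optional few-shot examples.
STYLES = {
    "concise": "Answer in at most two sentences.",
    "formal": "Use formal language and cite the source document for every claim.",
}


def build_system_prompt(org, agent, style, few_shots=()):
    lines = [
        f"You are {agent}, an assistant for {org}.",
        "Answer only from the retrieved document context.",
        STYLES[style],
    ]
    for question, answer in few_shots:  # few-shot examples shape the output format
        lines.append(f"Example Q: {question}\nExample A: {answer}")
    return "\n".join(lines)
```

Editing tone or verbosity then means editing a style string, not agent logic.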
The frontend allows visual and brand customization through:

- `src/colors.css` (theme colors)
- `index.html` and `manifest.json` (name, metadata)
- Icons (`favicon.ico`, `logo192.png`, `logo512.png`)
- `App.jsx`

This allows teams to re-skin and rebrand the app without modifying functionality.
Change `OPENAI_MODEL` and/or `OPENAI_EMBEDDING_MODEL` in the `.env` file. To switch providers, replace `backend/entities/llm.py` with LangChain-compatible wrappers for the new provider, although the provider must support image input if your documents contain images.

Very easy. Environment variables handle most deployment-level customization:

- `ORG_NAME`, `AGENT_NAME`
- `OPENAI_MODEL`, `EMBEDDING_MODEL`
- `SOURCE_DOC_DIR`, `IMAGE_DIR`, `FAISS_INDEX_DIR`
- `CLERK_PEM_PUBLIC_KEY`, `CLERK_PUBLISHABLE_KEY`
- Branding in `App.jsx`, favicon, manifest, and color definitions (`src/colors.css`)

This makes the project a plug-and-play foundation for any brand/organisation.
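A sample `.env` sketch tying the variables above together (all values are placeholders, not defaults shipped with the project):

```env
ORG_NAME=Acme Corp
AGENT_NAME=DocBot
OPENAI_MODEL=gpt-4o
EMBEDDING_MODEL=text-embedding-3-small
SOURCE_DOC_DIR=./docs
IMAGE_DIR=./images
FAISS_INDEX_DIR=./faiss_index
CLERK_PEM_PUBLIC_KEY=<your-clerk-pem-public-key>
CLERK_PUBLISHABLE_KEY=<your-clerk-publishable-key>
```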
Extremely flexible. Clerk supports a wide range of sign-in methods including email/password, magic link, Google, Outlook, and SSO providers. Configuration is done via the Clerk dashboard; no code changes are needed to support different auth methods, making it ideal for deployments across diverse environments.
Special thanks to the open-source community behind: