In an era where Large Language Models (LLMs) are transforming industries, the challenge of hallucination—where an AI model generates inaccurate or fabricated information—has become increasingly significant. This project, titled “RAG-Based AI Assistant”, was designed to address that problem by building a Retrieval-Augmented Generation (RAG) system. It ensures that every response generated by the AI is grounded in the user-provided documents, creating a factual, reliable, and context-aware conversational assistant.
This RAG pipeline integrates LangChain, ChromaDB, and Groq, combining modern advances in natural language understanding with vector-based retrieval. The assistant can load .txt or .pdf documents from a local folder, embed them using SentenceTransformers, store their semantic representations in ChromaDB, and then dynamically retrieve relevant information in real time when a user asks a question. The Groq API powers the language model inference, offering very fast response times while maintaining high-quality reasoning.
Traditional LLMs, though incredibly powerful, suffer from hallucination—they may provide confident but incorrect answers, especially when dealing with domain-specific or proprietary data. Businesses and researchers often need an AI system that can provide accurate, document-verified responses instead of relying on general model training.
The RAG-Based AI Assistant bridges this gap. It connects external knowledge sources (in this case, local text or PDF documents) with the generative capabilities of an LLM, ensuring that the final answer is based only on the provided information, not on external or imagined data.
Document Loading and Preprocessing:
The system first scans the /data folder for .txt or .pdf files. Each document is read, cleaned, and prepared for embedding.
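A minimal sketch of what this loading step can look like, assuming plain Python with pypdf for PDF parsing (the project may instead use LangChain's document loaders); the folder name and helper are illustrative:

```python
# Illustrative loading helper, not the project's exact code.
from pathlib import Path

from pypdf import PdfReader  # assumes `pip install pypdf`


def load_documents(data_dir: str = "data") -> dict[str, str]:
    """Return a {filename: raw text} mapping for every .txt/.pdf file in data_dir."""
    docs: dict[str, str] = {}
    for path in sorted(Path(data_dir).iterdir()):
        if path.suffix.lower() == ".txt":
            docs[path.name] = path.read_text(encoding="utf-8", errors="ignore")
        elif path.suffix.lower() == ".pdf":
            reader = PdfReader(str(path))
            docs[path.name] = "\n".join(page.extract_text() or "" for page in reader.pages)
    return docs


documents = load_documents()
```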
Text Chunking:
Using LangChain’s RecursiveCharacterTextSplitter, each document is split into smaller, overlapping text chunks (e.g., 1000 characters with 200-character overlaps). This enables fine-grained search and contextual retrieval.
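A sketch of the chunking step with the chunk size and overlap quoted above; the import path varies by LangChain version, and the metadata keys are illustrative:

```python
# Split each loaded document into overlapping chunks.
from langchain.text_splitter import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,    # maximum characters per chunk
    chunk_overlap=200,  # overlap preserves context across chunk boundaries
)

chunks, metadatas = [], []
for filename, text in documents.items():  # `documents` from the loading sketch
    for i, chunk in enumerate(splitter.split_text(text)):
        chunks.append(chunk)
        metadatas.append({"source": filename, "chunk": i})  # illustrative metadata
```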
Semantic Embedding:
Each chunk is transformed into a high-dimensional vector using the sentence-transformers/all-MiniLM-L6-v2 model. These embeddings capture semantic meaning, enabling similarity search beyond exact keyword matches.
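A sketch of the embedding step, reusing the `chunks` list from the splitting sketch; all-MiniLM-L6-v2 produces 384-dimensional vectors:

```python
# Encode every chunk into a dense vector with SentenceTransformers.
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# encode() returns a NumPy array of shape (num_chunks, 384) for this model.
embeddings = embedder.encode(chunks, show_progress_bar=True)
```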
Vector Storage with ChromaDB:
All embeddings, along with metadata such as filenames, are stored in a persistent ChromaDB vector database. This allows fast and efficient similarity-based retrieval.
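A sketch of the storage step with ChromaDB's persistent client; the storage path and collection name are illustrative, not taken from the project:

```python
# Persist chunk text, embeddings, and metadata in a local ChromaDB store.
import chromadb

client = chromadb.PersistentClient(path="chroma_db")
collection = client.get_or_create_collection(name="rag_documents")

collection.add(
    ids=[f"chunk-{i}" for i in range(len(chunks))],
    embeddings=embeddings.tolist(),
    documents=chunks,     # raw text, returned alongside search hits
    metadatas=metadatas,  # e.g. {"source": filename, "chunk": i}
)
```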
Query Processing and Retrieval:
When the user enters a question, it is embedded using the same model and compared against the stored document vectors to retrieve the most relevant chunks.
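A sketch of the retrieval step, reusing the embedding model and collection from the earlier sketches; `top_k` is an illustrative parameter:

```python
# Embed the question and ask ChromaDB for the closest chunks.
def retrieve(question: str, top_k: int = 4) -> tuple[list[str], list[dict]]:
    query_embedding = embedder.encode([question]).tolist()
    results = collection.query(query_embeddings=query_embedding, n_results=top_k)
    # Chroma returns one result list per query; we sent a single query.
    return results["documents"][0], results["metadatas"][0]
```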
Answer Generation (Groq Model):
The retrieved chunks are then passed to a Groq-powered LLM. A strict prompt template instructs the model to use only the retrieved information and to avoid hallucinating or adding external content.
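A sketch of the generation step, assuming the official Groq Python SDK; the model name and prompt wording are illustrative rather than taken from the project, and the API key is read from the GROQ_API_KEY environment variable:

```python
# Generate a grounded answer from the retrieved context via the Groq SDK.
from groq import Groq

groq_client = Groq()  # reads GROQ_API_KEY from the environment


def answer(question: str) -> str:
    context_chunks, sources = retrieve(question)  # from the retrieval sketch
    context = "\n\n".join(context_chunks)
    prompt = (
        "Answer the question using ONLY the context below. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    response = groq_client.chat.completions.create(
        model="llama-3.1-8b-instant",  # illustrative Groq-hosted model
        messages=[{"role": "user", "content": prompt}],
        temperature=0,                 # discourage speculation beyond the context
    )
    return response.choices[0].message.content
```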
Final Response:
The assistant produces an accurate, context-aware answer, directly grounded in the uploaded documents.
The project supports .txt and .pdf files and keeps a simple folder layout (src/, data/). Unlike typical chatbot or GPT-like applications, this project focuses on accuracy and trust. Every answer is traceable back to a source document. The combination of ChromaDB’s vector search and Groq’s speed ensures that the assistant is not just intelligent but also responsible and grounded.
Its modular design allows developers to extend functionality: you can plug in more advanced LLMs, integrate a Streamlit UI for user interaction, or deploy it as a backend API for enterprise applications.
Ultimately, this project demonstrates how Retrieval-Augmented Generation can transform generic LLMs into domain-specific experts capable of providing factual, explainable, and verifiable answers. It’s a practical example of how the next generation of AI systems will blend retrieval and generation for reliable intelligence.