https://github.com/sarthak-cs/RAG_chatbot
Large Language Models (LLMs) such as Gemini are powerful but limited by their static training data. They may hallucinate or fail when asked about information outside their knowledge cutoff. To address this limitation, Retrieval-Augmented Generation (RAG) combines information retrieval with generative AI to produce more accurate, context-aware responses.
In this project, I built a console-based RAG chatbot that retrieves relevant information from a custom dataset and generates grounded answers using Google Gemini 2.5 Pro. This work demonstrates the foundational concepts of agentic AI, where an AI system actively retrieves knowledge before reasoning and responding.
Document Ingestion
Custom text documents are loaded from local storage.
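This step can be sketched in a few lines. The folder path and `.txt` glob below are assumptions for illustration; the actual repo may load documents differently:

```python
from pathlib import Path

def load_documents(folder: str) -> dict[str, str]:
    """Read every .txt file in `folder` into a {filename: text} map."""
    docs = {}
    for path in sorted(Path(folder).glob("*.txt")):
        docs[path.name] = path.read_text(encoding="utf-8")
    return docs
```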
Text Chunking
Documents are split into overlapping chunks to preserve context while enabling efficient retrieval.
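A minimal sketch of overlapping chunking; the chunk size and overlap values are illustrative defaults, not necessarily what the project uses:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 100) -> list[str]:
    """Split text into fixed-size chunks where consecutive chunks
    share `overlap` characters, so context spanning a boundary
    still appears intact in at least one chunk."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size]
            for i in range(0, len(text), step)
            if text[i:i + chunk_size]]
```

Character-based splitting is the simplest scheme; token- or sentence-aware splitters trade simplicity for cleaner chunk boundaries.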
Embedding Generation
Each chunk is converted into vector embeddings using HuggingFace MiniLM.
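The real pipeline would call something like `SentenceTransformer("all-MiniLM-L6-v2").encode(chunk)` from the sentence-transformers library. As a dependency-free stand-in, the toy hashed bag-of-words embedding below shows only the shape of the step (text in, unit-length vector out); it is not a substitute for MiniLM's semantic quality:

```python
import hashlib

def embed(text: str, dim: int = 64) -> list[float]:
    """Toy stand-in for a sentence embedding model: hash each token
    into a bucket of a fixed-size count vector, then L2-normalize
    so dot products equal cosine similarity."""
    vec = [0.0] * dim
    for token in text.lower().split():
        idx = int(hashlib.md5(token.encode()).hexdigest(), 16) % dim
        vec[idx] += 1.0
    norm = sum(v * v for v in vec) ** 0.5 or 1.0  # avoid divide-by-zero on empty text
    return [v / norm for v in vec]
```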
Vector Storage
Embeddings are stored in a ChromaDB vector database for fast similarity search.
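ChromaDB handles persistence and indexing; conceptually, though, a vector store is just "keep (embedding, chunk) pairs, return the top-k by similarity." A minimal in-memory sketch (assuming unit-norm embeddings, so a dot product is cosine similarity):

```python
class VectorStore:
    """Minimal in-memory stand-in for ChromaDB."""

    def __init__(self):
        self.entries = []  # list of (embedding, chunk_text) pairs

    def add(self, embedding: list[float], chunk: str) -> None:
        self.entries.append((embedding, chunk))

    def query(self, query_embedding: list[float], k: int = 3) -> list[str]:
        """Return the k chunks most similar to the query embedding."""
        scored = [(sum(a * b for a, b in zip(emb, query_embedding)), chunk)
                  for emb, chunk in self.entries]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [chunk for _, chunk in scored[:k]]
```

A real vector database replaces the linear scan with an approximate-nearest-neighbor index so search stays fast as the collection grows.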
Query Processing
The chatbot operates in an interactive loop where user queries are processed in real time. Retrieved document chunks are explicitly injected into the prompt, so the model answers using only the retrieved context rather than its parametric memory.
Basic prompt guarding is applied by instructing the model to rely strictly on the retrieved information, reducing hallucinations.
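The prompt-assembly step above might look like the sketch below. The template wording is an assumption, not the project's exact prompt; in the real loop the resulting string would be sent to Gemini 2.5 Pro via the Google API client:

```python
# Guarded prompt template: the instruction to use ONLY the supplied
# context is the "prompt guarding" that reduces hallucinations.
PROMPT_TEMPLATE = """Answer the question using ONLY the context below.
If the answer is not in the context, say "I don't know."

Context:
{context}

Question: {question}
Answer:"""

def build_prompt(question: str, retrieved_chunks: list[str]) -> str:
    """Inject retrieved chunks into the guarded template."""
    context = "\n---\n".join(retrieved_chunks)
    return PROMPT_TEMPLATE.format(context=context, question=question)
```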
This project demonstrates how Retrieval-Augmented Generation can significantly improve the reliability of AI assistants. By integrating semantic retrieval with generative models, the system provides grounded, context-aware responses and serves as a strong foundation for building more advanced agentic AI systems.