This project implements a Retrieval-Augmented Generation (RAG) AI Assistant that enhances question answering by combining document retrieval with large language models (LLMs). The system ingests PDF documents (here, the Class 12 NCERT chemistry textbook), extracts and chunks the text, embeds the chunks into a vector database, and retrieves the most relevant content to produce accurate, context-aware answers.
Document Loading – PDF files are parsed and converted into plain text.
Text Chunking – The text is split into manageable word-based chunks.
Embedding Generation – A sentence-transformer model (all-MiniLM-L6-v2) converts each chunk into a dense vector embedding.
Vector Database – ChromaDB stores the embeddings and supports efficient similarity search (see the indexing sketch below).
Retrieval – For each user query, the most relevant document chunks are fetched from the database.
LLM Integration – Groq's LLaMA model generates natural-language answers, guided by the retrieved context.
Response Generation – The system returns concise, source-cited answers through a CLI interface (see the query sketch below).
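A minimal sketch of the indexing stage is shown below. It assumes pypdf for PDF parsing; the file name, chunk size, and collection name are illustrative placeholders rather than the project's actual values.

```python
# Indexing sketch: load a PDF, chunk it by words, embed the chunks,
# and store them in a persistent ChromaDB collection.
from pypdf import PdfReader
from sentence_transformers import SentenceTransformer
import chromadb

PDF_PATH = "ncert_chemistry_class12.pdf"   # placeholder file name
CHUNK_WORDS = 200                          # assumed chunk size in words

# 1. Document loading: extract raw text from every page.
reader = PdfReader(PDF_PATH)
text = " ".join(page.extract_text() or "" for page in reader.pages)

# 2. Text chunking: split the text into fixed-size word windows.
words = text.split()
chunks = [" ".join(words[i:i + CHUNK_WORDS]) for i in range(0, len(words), CHUNK_WORDS)]

# 3. Embedding generation with all-MiniLM-L6-v2.
model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(chunks).tolist()

# 4. Vector database: store chunks, embeddings, and source metadata.
client = chromadb.PersistentClient(path="./chroma_db")
collection = client.get_or_create_collection("ncert_chemistry")
collection.add(
    ids=[f"chunk-{i}" for i in range(len(chunks))],
    documents=chunks,
    embeddings=embeddings,
    metadatas=[{"source": PDF_PATH, "chunk": i} for i in range(len(chunks))],
)
```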
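The query stage can be sketched as follows. The Groq model name, prompt wording, and top-k value are assumptions for illustration; the Groq Python SDK exposes an OpenAI-style chat completions interface and reads the API key from the environment.

```python
# Query sketch: embed the question, retrieve the top-k chunks from ChromaDB,
# and ask Groq's LLaMA model to answer using only that context.
from sentence_transformers import SentenceTransformer
from groq import Groq
import chromadb

model = SentenceTransformer("all-MiniLM-L6-v2")
collection = chromadb.PersistentClient(path="./chroma_db").get_collection("ncert_chemistry")
llm = Groq()  # reads GROQ_API_KEY from the environment

def answer(question: str, k: int = 4) -> str:
    # Retrieval: fetch the k chunks closest to the query embedding.
    query_embedding = model.encode([question]).tolist()
    results = collection.query(query_embeddings=query_embedding, n_results=k)
    context = "\n\n".join(results["documents"][0])

    # Generation: answer strictly from the retrieved context and cite it.
    prompt = (
        "Answer the question using only the context below and cite the source chunks.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    response = llm.chat.completions.create(
        model="llama-3.1-8b-instant",  # assumed model name
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(answer("What is an example of a primary alcohol?"))
```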
The RAG Assistant successfully answers user queries using knowledge extracted from the uploaded PDFs (the Class 12 NCERT chemistry textbook). It retrieves contextually relevant passages, generates clear answers, and cites the original sources. Grounding the answers in retrieved text keeps responses reliable and reduces hallucinations, making the system a useful tool for learning, research, and automated Q&A tasks.