Abstract
This project presents the design and implementation of a Retrieval-Augmented Generation (RAG)-based research assistant that leverages Large Language Models (LLMs) and vector databases to provide accurate, context-aware responses to research-related queries. Unlike traditional question-answering systems that rely solely on pretrained knowledge, the proposed assistant integrates document retrieval with generative AI, ensuring that answers are grounded in the supplied research documents. The system performs document loading, text chunking, embedding generation, and semantic similarity search to retrieve relevant research content, which an LLM then synthesizes into a comprehensive answer. The solution demonstrates how RAG can enhance research workflows by enabling users to efficiently query and summarize large volumes of unstructured text.
Methodology
1. Data Collection and Preprocessing
- Research publications in .txt format were loaded using a document loader.
- Each document was split into overlapping text chunks using a recursive character text splitter to preserve semantic meaning while maintaining manageable input sizes for embeddings.
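The report does not specify the exact loader or chunking parameters; a minimal sketch of this step using LangChain's DirectoryLoader and RecursiveCharacterTextSplitter, with assumed chunk_size and chunk_overlap values and an assumed papers/ directory, might look like this:

```python
# Sketch: load plain-text publications and split them into overlapping chunks.
# The directory path, chunk size, and overlap are illustrative assumptions.
from langchain_community.document_loaders import DirectoryLoader, TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

loader = DirectoryLoader("papers/", glob="**/*.txt", loader_cls=TextLoader)
documents = loader.load()

# Overlap preserves semantic continuity across chunk boundaries.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_documents(documents)
print(f"Loaded {len(documents)} documents -> {len(chunks)} chunks")
```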
2. Embedding and Storage
- Each text chunk was transformed into a dense vector representation using the HuggingFace MiniLM embedding model.
- The embeddings, along with their metadata (title, chunk ID), were stored in a ChromaDB vector database for efficient similarity search.
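A sketch of the embedding and storage step, reusing the chunks produced above. The exact MiniLM model identifier (sentence-transformers/all-MiniLM-L6-v2), the chunk_id convention, and the persist directory are assumptions:

```python
# Sketch: embed chunks with a MiniLM sentence-transformer and store them in ChromaDB.
# Model name, metadata keys, and persist directory are assumptions.
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma

embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")

# Attach a chunk ID so each stored vector can be traced back to its source chunk.
for i, chunk in enumerate(chunks):
    chunk.metadata.setdefault("chunk_id", i)

vectordb = Chroma.from_documents(
    documents=chunks,
    embedding=embeddings,
    persist_directory="chroma_db",
)
```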
3. Retrieval
- When a user asks a question, the query is embedded into a vector representation using the same embedding model.
- A similarity search is performed in ChromaDB to retrieve the most relevant chunks of text from the research publications.
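The retrieval step can be sketched as follows; the query text and the number of retrieved chunks (k=4) are illustrative assumptions, and the persisted store from the previous step is reopened with the same embedding model:

```python
# Sketch: embed the user query with the same model and run a similarity search.
# The query string and k are assumed values.
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma

embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
vectordb = Chroma(persist_directory="chroma_db", embedding_function=embeddings)

query = "Types of learning in AI"
relevant_chunks = vectordb.similarity_search(query, k=4)
for doc in relevant_chunks:
    print(doc.metadata, "->", doc.page_content[:80])
```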
4. Generation
- The retrieved context is combined with the user’s query and passed to a Large Language Model (Llama 3.3 70B, accessed via ChatGroq) using a structured prompt template.
- The LLM synthesizes an answer that is both contextually relevant and grounded in the research documents.
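A sketch of the generation step, reusing query and relevant_chunks from the retrieval sketch. The prompt wording, temperature setting, and Groq model identifier are assumptions; the report itself names only ChatGroq with Llama 3.3 70B:

```python
# Sketch: combine retrieved context with the query and ask the LLM.
# Requires the GROQ_API_KEY environment variable; prompt text and
# model id are assumptions.
from langchain_groq import ChatGroq
from langchain_core.prompts import ChatPromptTemplate

llm = ChatGroq(model="llama-3.3-70b-versatile", temperature=0)

prompt = ChatPromptTemplate.from_template(
    "Answer the question using only the context below.\n\n"
    "Context:\n{context}\n\n"
    "Question: {question}"
)

# Concatenate the retrieved chunks into a single context string.
context = "\n\n".join(doc.page_content for doc in relevant_chunks)
response = (prompt | llm).invoke({"context": context, "question": query})
print(response.content)
```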
5. Evaluation
- The system outputs both the AI-generated answer and the list of document chunks (sources) used to generate the response, ensuring transparency and traceability.
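Putting the pieces together, a hypothetical helper (reusing the vectordb, prompt, and llm objects from the sketches above) can return the answer and its sources side by side:

```python
# Sketch: bundle the generated answer with the metadata of the chunks
# it was grounded in, for transparency and traceability.
def answer_with_sources(question: str, k: int = 4) -> dict:
    docs = vectordb.similarity_search(question, k=k)
    context = "\n\n".join(d.page_content for d in docs)
    response = (prompt | llm).invoke({"context": context, "question": question})
    return {
        "answer": response.content,
        "sources": [d.metadata for d in docs],  # e.g. title and chunk_id
    }

result = answer_with_sources("Examples of neural networks")
print(result["answer"])
for src in result["sources"]:
    print("source:", src)
```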
Results
- The system successfully integrated document retrieval with LLM-based text generation, creating a research assistant capable of answering queries with supporting sources.
- On queries such as “Types of learning in AI” and “Examples of neural networks”, the assistant produced comprehensive, contextually grounded answers along with relevant references from the research documents.
- The use of embeddings and vector search improved retrieval relevance, reducing the risk of hallucination by anchoring answers in the actual research texts rather than in the model’s parametric knowledge alone.