This project presents a Retrieval-Augmented Generation (RAG) AI assistant designed to answer questions using external documents. The system leverages vector embeddings and a local dataset of documents sourced from Wikipedia to provide accurate and contextually relevant answers.
Key contributions:
Implementation of a RAG pipeline using Python.
Integration of a vector database for efficient document retrieval.
Demonstration of performance using sample queries on Wikipedia-sourced data.
The code and dataset are publicly available in the GitHub repository.
Introduction
This is a solo work i did by my self.Traditional language models generate text based solely on learned patterns, which can lead to outdated or incorrect responses. Retrieval-Augmented Generation (RAG) enhances the accuracy of AI responses by combining a language model with external knowledge sources.
This project implements a RAG assistant that:
Retrieves relevant documents from a dataset.
Generates answers informed by the retrieved documents.
Provides a reproducible framework for integrating new datasets.
The dataset used consists of Wikipedia pages related to artificial intelligence, machine learning, and natural language processing.
The goal of the experiments is to evaluate the RAG assistant's ability to:
Retrieve relevant documents from the dataset.
Generate accurate and contextually appropriate answers.
Experimental Setup
Dataset: Text files from Wikipedia pages stored in src/data/example_docs/
Vector database: FAISS embeddings using OpenAI embeddings
LLM model: OpenAI GPT-based model
Queries: Questions related to AI, ML, and NLP
Results
This section presents the performance and observations from the experiments conducted with the RAG assistant using the example dataset. Both quantitative and qualitative analyses are provided to assess its retrieval and generation capabilities.
Quantitative Analysis
We tested the RAG system on a set of 10 sample queries covering different topics present in the dataset. The results are summarized below:
Observation: The system successfully retrieved relevant documents for every query, and most answers were correct and contextually appropriate.
Qualitative Analysis
Here are examples of the RAG assistant’s behavior:
Successful retrieval and answer generation
Query: "What is artificial intelligence?"
Retrieved document:Wikipedia article on Artificial Intelligence
Generated answer: "Artificial intelligence (AI) refers to the simulation of human intelligence in machines that are programmed to think and learn like humans."
Outcome: Fully correct
Partial success
Query: "Explain reinforcement learning."
Retrieved documents: Several AI-related articles
Generated answer: "Reinforcement learning is a type of machine learning where agents learn to make decisions by receiving rewards or penalties."
Outcome: Correct but missing details on algorithms and practical examples
Limitation / failure
Query: "Who is the CEO of OpenAI?"
Retrieved documents: None directly covering the CEO
Generated answer: "I am not sure about the current CEO of OpenAI."
Screenshot examples: (Insert images of terminal outputs or notebook outputs showing query, retrieved docs, and generated answers.)
Key Insights
The system performs well when relevant documents exist in the dataset.
Retrieval significantly improves the accuracy and specificity of LLM-generated answers compared to using the LLM alone.
Dataset coverage is critical: missing information leads to partial or generic answers.
For broader applicability, expanding the dataset with more documents will directly improve performance.
System demonstrates potential for real-world RAG applications in knowledge-intensive tasks.
Optional visualizations:
Accuracy chart: bar plot comparing correct, partial, and failed answers.
Retrieval success rate plot per query.
(Include charts if generated from your notebook.)
Conclusion
This work demonstrates the development and evaluation of a Retrieval-Augmented Generation (RAG) assistant capable of answering queries by combining document retrieval with large language model (LLM) generation.
Key takeaways:
Effectiveness of RAG: Integrating document retrieval with an LLM significantly improves answer accuracy and relevance compared to using an LLM alone.
Dependence on dataset quality: The system performs well when relevant documents are present but may produce generic or incomplete answers if the dataset is limited.
Scalability and adaptability: The approach is flexible and can be extended to larger datasets or domain-specific knowledge bases, enhancing applicability in diverse real-world tasks.
Future improvements: Expanding and curating the dataset, incorporating more advanced retrieval techniques, and fine-tuning the LLM could further improve performance.
In conclusion, this project highlights the potential of RAG systems for knowledge-intensive applications and provides a foundation for further exploration in building efficient, accurate, and context-aware AI assistants.