RAG Wiki Assistant is a Retrieval-Augmented Generation (RAG) system that combines LangChain, FAISS, and Gemini 2.5 Flash to answer user queries using a Wikipedia-based knowledge base.
It dynamically retrieves relevant information and generates context-rich, grounded responses using verified content rather than model memory.
Key features include:
This publication outlines the assistant's purpose, technical architecture, and potential applications in education, research, and AI development.
The RAG Wiki Assistant is designed to generate accurate, fact-based, and context-aware responses by combining retrieval and generation techniques.
Instead of relying on the model's internal memory, which can produce hallucinations, it grounds its outputs in Wikipedia content, making it suitable for:
The assistant ensures every response is relevant, verifiable, and contextually grounded in the retrieved source material.
AI assistants that answer from model memory alone can produce hallucinated or unverifiable information.
This project addresses that issue through retrieval-augmented prompting, so that each answer is backed by content retrieved from Wikipedia.
In short, RAG Wiki Assistant bridges the gap between retrieval precision and generative fluency, making it a trusted AI companion for learning and discovery.
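The retrieval step described above can be illustrated with a minimal, standard-library sketch. It is not the project's implementation: the toy word-count `embed` function stands in for a real embedding model, the sorted list stands in for a FAISS index, and the names `embed`, `cosine`, and `retrieve` are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query, best match first."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:k]


# Illustrative chunks, not taken from the project's knowledge base.
chunks = [
    "Transfer learning reuses a model trained on one task for a related task.",
    "FAISS is a library for efficient similarity search over dense vectors.",
    "The Eiffel Tower is a wrought-iron lattice tower in Paris.",
]
top = retrieve("What is transfer learning?", chunks, k=1)
print(top[0])
```

In a real pipeline the retrieved chunks would then be passed to the language model as context, which is the "generation" half of RAG.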
| Component | Purpose |
|---|---|
| LangChain | Orchestrates the retrieval → generation pipeline |
| FAISS | Provides high-speed vector-based similarity search |
| Gemini 2.5 Flash | Generates factual, context-aware answers |
| WikipediaLoader | Fetches Wikipedia content dynamically |
| RecursiveCharacterTextSplitter | Splits long texts into overlapping chunks |
| dotenv & Logging | Ensures secure configuration and traceable operations |
.env file.
This modular design makes the assistant robust, scalable, and transparent.
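To make "overlapping chunks" in the table above concrete, here is a simplified sketch of the idea. It uses a plain fixed-size sliding window, whereas LangChain's `RecursiveCharacterTextSplitter` also tries to break on separators such as paragraphs and sentences. The function name `split_with_overlap` and the sizes are assumptions for illustration only.

```python
def split_with_overlap(text: str, chunk_size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into fixed-size character windows that share `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


text = "Retrieval-augmented generation grounds answers in retrieved documents."
chunks = split_with_overlap(text, chunk_size=30, overlap=10)
for chunk in chunks:
    print(repr(chunk))
```

The overlap means a sentence cut at a chunk boundary still appears intact in at least one neighbouring chunk, which tends to help retrieval quality.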
The assistant is designed for quick setup and easy customization.
.env variables.

    # Clone the repository
    git clone https://github.com/Ramee4sure/RAG-Wikipedia-Assitant.git
    cd RAG-Wikipedia-Assitant

    # Create a virtual environment (optional)
    python -m venv .venv
    .venv\Scripts\activate      # Windows
    source .venv/bin/activate   # Mac/Linux

    # Install dependencies
    pip install -r requirements.txt

    # Run the assistant
    python src/app.py

Full instructions, configuration examples, and environment templates are available in the [GitHub Repository](https://github.com/Ramee4sure/RAG-Wikipedia-Assitant).
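As a rough illustration of how `.env`-based configuration works, the sketch below parses simple `KEY=VALUE` lines with the standard library only. The project itself lists dotenv among its components, so it presumably relies on that package rather than a hand-written parser. The helper `load_env_file` and the variable name `GOOGLE_API_KEY` are assumptions made for this example.

```python
import os
import tempfile
from pathlib import Path


def load_env_file(path: Path) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw_line in path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# Demo with a throwaway file; GOOGLE_API_KEY is a hypothetical variable name.
with tempfile.TemporaryDirectory() as tmp:
    env_path = Path(tmp) / ".env"
    env_path.write_text(
        '# example configuration\nGOOGLE_API_KEY="your-key-here"\n',
        encoding="utf-8",
    )
    config = load_env_file(env_path)

# Export without overwriting variables already set in the real environment.
for key, value in config.items():
    os.environ.setdefault(key, value)

print(sorted(config))
```

Keeping secrets in an untracked `.env` file rather than in source code is what makes the configuration "secure" in the sense used in the component table.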

    from rag_chain.rag import start_bot

    # Initialize and start the assistant
    if __name__ == "__main__":
        start_bot()

A dynamic LangChain PromptTemplate ensures that the LLM generates answers only from retrieved context.
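A context-only prompt of this kind might look roughly like the sketch below. It uses plain `str.format` so that it runs without extra dependencies; a LangChain `PromptTemplate` accepts a similar string with `{context}` and `{question}` placeholders. The wording of `GROUNDED_PROMPT` and the helper `build_prompt` are hypothetical, and the project's actual template may differ.

```python
# Hypothetical prompt wording; the project's actual template may differ.
GROUNDED_PROMPT = (
    "Answer the question using only the context below.\n"
    "If the context does not contain the answer, reply exactly:\n"
    "\"I'm sorry, but this question cannot be answered with the provided documents.\"\n\n"
    "Context:\n{context}\n\n"
    "Question: {question}\n"
    "Answer:"
)


def build_prompt(question: str, retrieved_chunks: list[str]) -> str:
    """Join the retrieved chunks and fill the template placeholders."""
    context = "\n---\n".join(retrieved_chunks)
    return GROUNDED_PROMPT.format(context=context, question=question)


prompt = build_prompt(
    "What is Transfer Learning in Machine Learning?",
    ["Transfer learning reuses a model trained on one task for a related task."],
)
print(prompt)
```

Spelling out the fallback sentence in the prompt is one common way to get a consistent refusal for out-of-scope questions instead of an answer drawn from model memory.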
Query:
“What is Transfer Learning in Machine Learning?”
Response:
“Transfer Learning is a machine learning technique where a model trained on one task is reused as a starting point for another related task. (Source: Wikipedia)”
Out-of-Scope Query:
“I'm sorry, but this question cannot be answered with the provided documents.”
| Criterion | Status |
|---|---|
| Clear Purpose | ✅ Defined in Abstract & Purpose |
| Value/Impact | ✅ Grounded, reliable, and reproducible |
| Technical Quality | ✅ Modular and verifiable pipeline |
| Usability | ✅ Clear setup and documentation |
| Validation | ✅ Wikipedia-supported factual grounding |
| Reproducibility | ✅ Fully open-source and configurable |

    RAG-Wikipedia-Assistant/
    ├── src/
    │   ├── scraper/            # Wikipedia text scraper
    │   ├── rag_chain/          # RAG pipeline (Gemini + FAISS)
    │   ├── app.py              # Entry point with .env creation
    │   └── wikipedia_pages/    # Stored Wikipedia documents
    ├── vectorstore/            # FAISS vector database
    ├── .env_example            # Configuration template
    ├── requirements.txt        # Dependency list
    └── README.md               # Documentation

| Name | Role |
|---|---|
| Ramadan | GitHub & Documentation |
| Manas | Wikipedia Scraper |
| Mohammad | RAG Chain Development |
| Akinpeumi | Integrations & Testing |
MIT License
Permission is granted, free of charge, to use, copy, modify, and distribute this work with attribution.
Primary Author: Manas Gaurkar
Email: manas.gaurkar.dev@gmail.com
GitHub: Ramee4sure
Project: RAG-Wikipedia-Assitant