Author: MOKEMI TCHAKOUTE BAUTREL Duval
Date: March 2026
DocuChat-RAG is a conversational assistant based on retrieval-augmented generation (RAG), designed to answer questions using only your own documents. Unlike general-purpose chatbots, which can produce erroneous or speculative answers, this assistant is constrained to respond only from indexed documents, which makes it well suited to domain-specific applications or those subject to compliance requirements.
The system enhances reliability through:
Users can interact with the chatbot via a simple and intuitive command-line interface (CLI).
This publication describes the architecture, operation, and management of the RAG-based chatbot system. It details all the essential components:
In addition to describing the architecture, this document includes instructions for running the application, configuration options, and a test dataset to validate system behavior. It also outlines current limitations and areas for future improvement.
The dataset used for this application is a collection of 7 carefully selected text documents covering a variety of current topics:

| File | Topic | Main content |
|---|---|---|
| artificial_intelligence.txt | Artificial Intelligence | Principles of AI, Machine Learning, Deep Learning, NLP, AI Ethics |
| biotechnology.txt | Biotechnology | CRISPR, Synthetic biology, Personalized medicine, Regenerative medicine, Bioethics |
| climate_science.txt | Climate Science | Climate change, Greenhouse effect, Ecological impacts, Renewable energy, Adaptation |
| quantum_computing.txt | Quantum Computing | Quantum principles (superposition, entanglement), Quantum algorithms, Quantum hardware, Applications |
| space_exploration.txt | Space Exploration | Solar system, Mars exploration, International Space Station, Future of space travel, Exoplanets |
| sustainable_energy.txt | Sustainable Energy | Solar energy, Wind energy, Energy storage, Smart grids, Green hydrogen |
| sample_documents.txt | Data Science | Data science methodology, Feature engineering, Model evaluation, MLOps, Visualization |

Together, these files form a small multidisciplinary knowledge base, enabling the assistant to answer questions across several scientific and technological fields.
| Component | Technology | Role |
|---|---|---|
| Language | Python 3.9+ | Primary programming language |
| Vector Database | ChromaDB | Semantic storage and retrieval |
| Embeddings | Sentence Transformers (all-MiniLM-L6-v2) | Text to vector conversion (384 dimensions) |
| LLM | OpenAI / Groq / Gemini | Response generation |
| Framework | LangChain | RAG orchestration and memory management |
| Interface | CLI (Python) | Command-line interaction |

                ┌─────────────────────────────┐
                │            USER             │
                │       Ask a Question        │
                └──────────────┬──────────────┘
                               │
                               ▼
                ┌─────────────────────────────┐
                │       QUERY EMBEDDING       │
                │    Sentence Transformers    │
                │     (all-MiniLM-L6-v2)      │
                └──────────────┬──────────────┘
                               │
                               ▼
                ┌─────────────────────────────┐
                │        VECTOR SEARCH        │
                │          ChromaDB           │
                │  Top-K Semantic Retrieval   │
                └──────────────┬──────────────┘
                               │
                               ▼
                ┌─────────────────────────────┐
                │     RETRIEVED DOCUMENTS     │
                │       Relevant Chunks       │
                │ (chunk_size=500, overlap=50)│
                └──────────────┬──────────────┘
                               │
                ┌──────────────┴─────────────┐
                ▼                            ▼
    ┌───────────────────────┐   ┌─────────────────────────┐
    │  CONVERSATION MEMORY  │   │   PROMPT CONSTRUCTION   │
    │  Windowed Memory (5)  │   │   Context + Question    │
    │   JSON Chat History   │   │  Memory + Instructions  │
    └───────────┬───────────┘   └────────────┬────────────┘
                │                            │
                └──────────────┬─────────────┘
                               ▼
                  ┌─────────────────────────┐
                  │           LLM           │
                  │ OpenAI / Groq / Gemini  │
                  │   Grounded Generation   │
                  └────────────┬────────────┘
                               │
                               ▼
                    ┌─────────────────────┐
                    │    FINAL ANSWER     │
                    │  Grounded Response  │
                    └─────────────────────┘

| Component | Description |
|---|---|
| RAGAssistant | Main class that orchestrates the end-to-end flow of requests |
| VectorDB | Handles ingestion, chunking, and semantic searching via ChromaDB |
| WindowedFileChatHistory | Custom memory: window of the last 5 exchanges + JSON persistence |
| Multi-LLM Support | Integration of multi-vendor models (OpenAI, Groq, Gemini) |
| Prompt Template | Structured template with hallucination prevention constraints |
| CLI Interface | Command-line interface for interactive testing |
When the application starts, the system automatically loads all .txt documents present in the data/ directory.
Ingestion pipeline:
Chunking: RecursiveCharacterTextSplitter (chunk_size=500, overlap=50)
Embedding: all-MiniLM-L6-v2
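
The effect of the two chunking parameters can be illustrated with a simplified sliding-window splitter. This is only a sketch: RecursiveCharacterTextSplitter also tries to split on natural separators such as paragraph and line breaks, which the hypothetical `chunk_text` helper below does not attempt.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Simplified stand-in for RecursiveCharacterTextSplitter: it ignores
    paragraph and sentence boundaries and only shows how chunk_size and
    overlap interact.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("require chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # distance between the starts of two windows
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks
```

With the default values, each new chunk starts 450 characters after the previous one, so the last 50 characters of one chunk are repeated at the start of the next.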
Document search uses cosine similarity with HNSW (Hierarchical Navigable Small World) indexing, which provides fast approximate nearest-neighbor lookup.
| Setting | Value | Role |
|---|---|---|
| top_k | 3 | Number of most relevant chunks to return |
| distance_threshold | 0.5 | Maximum distance for a retrieved chunk to be treated as relevant |
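
The two settings can be read as a ranking step followed by a filter. The sketch below shows the idea with a brute-force scan in plain Python; in the application, ChromaDB performs the search through its HNSW index instead. The helper names `cosine_distance` and `retrieve` are hypothetical, and applying the threshold after the top_k cut is an assumption about the order of operations.

```python
import math

def cosine_distance(a: list[float], b: list[float]) -> float:
    """Return 1 - cosine similarity (0.0 means the vectors point the same way)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        return 1.0  # treat a zero vector as unrelated
    return 1.0 - dot / norm

def retrieve(query_vec, chunk_vecs, top_k=3, distance_threshold=0.5):
    """Rank chunks by distance, keep the top_k closest, then drop any
    whose distance exceeds distance_threshold.

    Returns a list of (chunk_index, distance) pairs, closest first.
    """
    scored = sorted(
        (cosine_distance(query_vec, vec), idx) for idx, vec in enumerate(chunk_vecs)
    )
    return [(idx, dist) for dist, idx in scored[:top_k] if dist <= distance_threshold]
```

Under this definition of distance, a threshold of 0.5 keeps only chunks whose cosine similarity to the query is at least 0.5, so a question unrelated to the corpus can return no context at all.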
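The application implements custom memory via WindowedFileChatHistory:

    # Import path assumed; older LangChain releases expose this class elsewhere.
    from langchain_community.chat_message_histories import FileChatMessageHistory


    class WindowedFileChatHistory(FileChatMessageHistory):
        """Return only the last 5 exchanges to the LLM, but persist the
        full history in a JSON file."""

        def __init__(self, file_path: str, k: int = 5):
            super().__init__(file_path)
            self.k = k

        @property
        def messages(self):
            # Caution: if the parent's add_message() rebuilds the file from
            # self.messages, this window may also limit what gets persisted;
            # check the behavior of your installed version.
            all_messages = super().messages
            # Keep the last k exchanges (k * 2 messages)
            return all_messages[-(self.k * 2):]
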
Benefits:
✅ Limited context = bounded token cost per request
✅ Complete persistent history (JSON)
✅ Automatic reset for each test session
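
The separation between what is persisted and what the LLM sees can be sketched without any framework. The class below is a hypothetical, dependency-free illustration (it is not the application's WindowedFileChatHistory, and it stores messages as plain dictionaries): writes always operate on the full history, while reads return only the window.

```python
import json
from pathlib import Path

class SimpleWindowedHistory:
    """Persist every message to a JSON file, expose only the last k exchanges."""

    def __init__(self, file_path: str, k: int = 5):
        if k < 1:
            raise ValueError("k must be at least 1")
        self.path = Path(file_path)
        self.k = k
        if not self.path.exists():
            self.path.write_text("[]", encoding="utf-8")

    def _load_all(self) -> list[dict]:
        return json.loads(self.path.read_text(encoding="utf-8"))

    def add_message(self, role: str, content: str) -> None:
        # Append to the full history, never to the window, so nothing is lost.
        messages = self._load_all()
        messages.append({"role": role, "content": content})
        self.path.write_text(json.dumps(messages, ensure_ascii=False), encoding="utf-8")

    @property
    def full_history(self) -> list[dict]:
        return self._load_all()

    @property
    def messages(self) -> list[dict]:
        # One exchange = one user message + one assistant message.
        return self._load_all()[-(self.k * 2):]

    def clear(self) -> None:
        self.path.write_text("[]", encoding="utf-8")
```

Keeping the write path independent of the windowed `messages` property is the design point: the prompt stays short while the JSON file remains a complete record.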
The application automatically detects the available API key from among:
OpenAI: GPT-4o, GPT-3.5-Turbo
Groq: Llama-3, Mixtral
Google: Gemini Pro
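
Key detection of this kind usually comes down to checking environment variables in a fixed order. The sketch below assumes the conventional variable names used by each vendor's SDK and an OpenAI, Groq, Google priority; both the names and the order are assumptions, and `detect_provider` is a hypothetical helper.

```python
import os
from typing import Optional

# Assumed variable names and priority order (first match wins).
PROVIDER_ENV_VARS = {
    "openai": "OPENAI_API_KEY",
    "groq": "GROQ_API_KEY",
    "google": "GOOGLE_API_KEY",
}

def detect_provider(env=None) -> Optional[str]:
    """Return the first provider whose API key is set and non-empty, else None."""
    env = os.environ if env is None else env
    for provider, var_name in PROVIDER_ENV_VARS.items():
        if env.get(var_name, "").strip():
            return provider
    return None
```

Passing a plain dictionary as `env` makes the function easy to test without touching the real environment.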
The system prompt includes strict constraints:
Constraints:
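
A minimal sketch of how grounding constraints might be combined with the retrieved context, the memory window, and the question is shown here. The instruction wording and the `build_prompt` helper are assumptions for illustration, not the application's actual template.

```python
# Hypothetical instruction wording; not the application's actual prompt.
GROUNDING_RULES = (
    "Answer using ONLY the context below. "
    "If the context does not contain the answer, say that you do not know. "
    "Do not use outside knowledge."
)

def build_prompt(context_chunks, history, question):
    """Assemble instructions, retrieved chunks, recent history and the question.

    history is a list of (role, text) pairs, for example the last 5 exchanges.
    """
    context = "\n\n".join(
        f"[{i}] {chunk}" for i, chunk in enumerate(context_chunks, start=1)
    ) or "(no relevant documents found)"
    past = "\n".join(f"{role}: {text}" for role, text in history) or "(none)"
    return (
        f"{GROUNDING_RULES}\n\n"
        f"Context:\n{context}\n\n"
        f"Conversation so far:\n{past}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

Making the empty-context case explicit gives the model a clear signal to decline rather than improvise when retrieval finds nothing relevant.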
All parameters are configurable via environment variables (.env):
LLM Supplier Selection
Distance thresholds and top_k
Memory window size (k)
Logging levels
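
As an illustration, settings like these could be read from the environment as follows. The variable names (`LLM_PROVIDER`, `TOP_K`, `DISTANCE_THRESHOLD`, `MEMORY_WINDOW`, `LOG_LEVEL`) and the `"auto"` and `"INFO"` defaults are hypothetical; the numeric defaults are the values quoted in this document. Values stored in a .env file would first need to be loaded into the process environment, for example with python-dotenv.

```python
import os
from dataclasses import dataclass

@dataclass(frozen=True)
class Settings:
    llm_provider: str        # assumed values: "openai", "groq", "google" or "auto"
    top_k: int               # number of chunks to retrieve
    distance_threshold: float
    memory_window: int       # k, the number of exchanges sent to the LLM
    log_level: str

def load_settings(env=None) -> Settings:
    """Read settings from environment variables (names are assumptions),
    falling back to top_k=3, distance_threshold=0.5 and k=5."""
    env = os.environ if env is None else env
    return Settings(
        llm_provider=env.get("LLM_PROVIDER", "auto"),
        top_k=int(env.get("TOP_K", "3")),
        distance_threshold=float(env.get("DISTANCE_THRESHOLD", "0.5")),
        memory_window=int(env.get("MEMORY_WINDOW", "5")),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
    )
```

Converting the values once at startup means a malformed entry such as `TOP_K=three` fails immediately with a ValueError instead of surfacing later during a query.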
Data flow
Document ingestion flow

    ┌───────────────────────┐
    │    data/ directory    │
    │  Raw Text Documents   │
    │     (.txt files)      │
    └───────────┬───────────┘
                │
                ▼
    ┌───────────────────────┐
    │   Document Loading    │
    │   UTF-8 File Reader   │
    └───────────┬───────────┘
                │
                ▼
    ┌───────────────────────┐
    │     Text Chunking     │
    │   chunk_size = 500    │
    │     overlap = 50      │
    └───────────┬───────────┘
                │
                ▼
    ┌───────────────────────┐
    │ Embedding Generation  │
    │ SentenceTransformers  │
    │    384-dim vectors    │
    └───────────┬───────────┘
                │
                ▼
    ┌───────────────────────┐
    │       ChromaDB        │
    │     Vector Store      │
    │  Persistent Storage   │
    └───────────────────────┘

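
The first stage of this flow, reading the raw files, can be sketched with the standard library alone. The `load_documents` helper and the shape of its return value are assumptions for illustration; the application's own loader may differ.

```python
from pathlib import Path

def load_documents(data_dir: str = "data") -> list[dict]:
    """Read every .txt file in data_dir as UTF-8.

    Returns one dict per non-empty file holding its text and source
    filename, so that chunks can later be traced back to their file.
    """
    documents = []
    for path in sorted(Path(data_dir).glob("*.txt")):
        text = path.read_text(encoding="utf-8").strip()
        if text:  # skip empty files
            documents.append({"source": path.name, "text": text})
    return documents
```

Sorting the paths keeps the ingestion order deterministic, which makes repeated test runs easier to compare.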
Request processing flow

    ┌─────────────────────────────┐
    │         User Query          │
    └──────────────┬──────────────┘
                   │
                   ▼
    ┌─────────────────────────────┐
    │       Query Embedding       │
    └──────────────┬──────────────┘
                   │
                   ▼
    ┌─────────────────────────────┐
    │        Vector Search        │
    │          ChromaDB           │
    │          top_k = 3          │
    └──────────────┬──────────────┘
                   │
                   ▼
    ┌─────────────────────────────┐
    │      Retrieved Context      │
    └──────────────┬──────────────┘
                   │
                   ▼
    ┌─────────────────────────────┐
    │       Prompt Assembly       │
    │  (Context + User Question)  │
    └──────────────┬──────────────┘
                   │
                   ▼
    ┌─────────────────────────────┐
    │       LLM Generation        │
    │      (Grounded Answer)      │
    └──────────────┬──────────────┘
                   │
                   ▼
    ┌─────────────────────────────┐
    │          Response           │
    └─────────────────────────────┘

    Conversation Memory
    (last 5 exchanges)
            │
            └────────────► Injected into Prompt

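
The order of operations in this flow can be summarized as one function that wires the stages together. This is a sketch under stated assumptions: `answer_question` and its callable parameters are hypothetical stand-ins for the embedding model, the ChromaDB search, the prompt template, and the LLM client, and the history is a plain list of (role, text) pairs.

```python
from typing import Callable, Sequence

def answer_question(
    question: str,
    embed: Callable[[str], Sequence[float]],
    search: Callable[[Sequence[float], int], list],
    make_prompt: Callable[[list, list, str], str],
    generate: Callable[[str], str],
    history: list,
    top_k: int = 3,
    window: int = 5,
) -> str:
    """Run one turn: embed, search, assemble the prompt, generate, remember."""
    query_vec = embed(question)                 # Query Embedding
    chunks = search(query_vec, top_k)           # Vector Search
    recent = history[-(window * 2):]            # last `window` exchanges
    prompt = make_prompt(chunks, recent, question)  # Prompt Assembly
    reply = generate(prompt)                    # LLM Generation
    history.append(("user", question))
    history.append(("assistant", reply))
    return reply
```

Passing the stages in as callables keeps the flow testable with simple stubs and lets the LLM provider be swapped without touching the orchestration.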
Example interactions
Example 1: Artificial Intelligence

Example 2: Follow-up questions (memory test)

Example 3: Quantum Computing

Follow these steps to install and run DocuChat-RAG on your machine.

Clone the repository:

    git clone https://github.com/Mcduval/docuchat-rag.git
    cd docuchat-rag

Create and activate a virtual environment:

    python -m venv venv
    source venv/bin/activate   # Linux/Mac
    venv\Scripts\activate      # Windows

Install the dependencies:

    pip install -r requirements.txt

Create your configuration file, then edit .env with your key (OpenAI, Groq or Google):

    cp .env.example .env

Place your .txt files in the data/ folder, then start the application:

    python src/app.py

Here are some examples of questions to ask the assistant, organized by topic:
Artificial Intelligence

| Question |
|---|
| "What is Deep Learning?" |
| "Explain Natural Language Processing to me" |
| "What are the ethical challenges of AI?" |
| "What is the difference between supervised and unsupervised learning?" |

Biotechnology

| Question |
|---|
| "How does CRISPR-Cas9 work?" |
| "What is personalized medicine?" |
| "Explain synthetic biology" |
| "What are the ethical considerations in biotechnology?" |

Climate Science

| Question |
|---|
| "What is the greenhouse effect?" |
| "How does climate change affect ecosystems?" |
| "What are the main renewable energy solutions?" |
| "What is the difference between climate adaptation and mitigation?" |

Quantum Computing

| Question |
|---|
| "What is quantum superposition?" |
| "What is Shor's algorithm used for?" |
| "What are the applications of quantum computing?" |
| "What are qubits and how do they differ from classical bits?" |

Space Exploration

| Question |
|---|
| "What are the goals of Mars exploration?" |
| "What is the International Space Station?" |
| "How do scientists discover exoplanets?" |
| "What are the terrestrial planets in our solar system?" |

Sustainable Energy

| Question |
|---|
| "How does solar energy work?" |
| "What is a smart grid?" |
| "Why is energy storage important for renewable energy?" |
| "What is green hydrogen and how is it produced?" |

Data Science

| Question |
|---|
| "Explain the CRISP-DM methodology" |
| "What is feature engineering?" |
| "How do you evaluate a machine learning model?" |
| "What is MLOps?" |

| Scenario | Description |
|---|---|
| CRISPR Follow-up | "Tell me about CRISPR" → "Who discovered it?" → "What are its medical applications?" |
| Quantum Computing Follow-up | "Explain quantum computing" → "What makes it different from classical computing?" → "What are the current hardware limitations?" |

| Limitation | Description |
|---|---|
| 📄 Limited format support | Only .txt files are supported |
| 🖥️ No web interface | CLI only (no Streamlit front end) |
| 👥 No multi-session support | Only one conversation history at a time |
| ☁️ Cloud LLMs only | No support for local models (Ollama, LM Studio) |

| Feature | Status |
|---|---|
| 📄 PDF, Markdown, and DOCX support | 📅 Planned |
| 🖥️ Web interface (Streamlit) | 📅 Planned |
| 👥 Multi-user sessions | 📅 Planned |
| 💻 Local LLMs (Ollama, LM Studio) | 📅 Planned |
| 🔍 Hybrid search (BM25 + vector) | 📅 Planned |
| 📤 Conversation export (PDF, TXT) | 📅 Planned |

Contact information
GitHub: https://github.com/Mcduval
LinkedIn: https://www.linkedin.com/in/mokemi-tchakoute-bautrel-duval
Email: bautrelduval@gmail.com
License
This project is distributed under the MIT License.