MEDIBOT — Multi‑Agent AI Medical Assistant With UI


MEDIBOT — Multi‑Agent AI Medical Assistant with Streamlit UI

Abstract

In the age of intelligent healthcare, the demand for secure, explainable, and privacy-centric AI systems has grown exponentially. Traditional medical chatbots rely heavily on cloud-based APIs, raising concerns about data privacy, limited transparency, and inconsistent factual grounding. MEDIBOT redefines this paradigm through a multi-agent, retrieval-augmented, and domain-controlled medical conversational framework that ensures privacy, reliability, and clinical relevance.

The MEDIBOT Multi-Agent System integrates LangChain, FAISS, and Google Gemini LLMs to build a retrieval-augmented medical dialogue engine capable of dynamic reasoning, factual validation, and multimodal processing. By orchestrating specialized AI agents—spanning Diagnosis, Drug Information, Lifestyle Guidance, Medical Research Summarization, and Image Analysis—MEDIBOT provides context-aware, evidence-based, and ethically aligned healthcare assistance for professionals, students, and patients.

This project is an updated version of the following project:

https://app.readytensor.ai/publications/hIEm1nLm0QNG

The new version is an updated, production-ready release featuring a Streamlit web UI and several key enhancements:

Streamlit UI: Interactive dashboard with live agent progress, chat interface, and one-click response downloads.
Browse Files & Images: Upload PDFs, DOCX, or images for analysis using EasyOCR (text extraction) and BLIP (image captioning).
Enhanced Privacy: Embeddings, retrieval, and storage stay on the local machine under strict domain enforcement; external calls are limited to the model and medical data APIs the agents use.
Improved Orchestration: Structured agent flow (Diagnosis → Drug Info → Lifestyle → Research → Image) with guardrails and retries.
Retrieval-Augmented Generation (RAG): Contextual retrieval using FAISS and semantic similarity scoring.
Long-Term Memory: Conversation summarization for consistent and coherent dialogue.
Safety & Evaluation Layers: Domain filters, factual accuracy checks, and ethical alignment.
Usability Paths: CLI, Python API, and Streamlit—use in any workflow.
Artifacts & Logs: Organized outputs with audit-friendly metadata and logging.

Highlights

🩺 Medical-grade RAG pipeline with factual retrieval
🧠 Multi-agent orchestration across Diagnosis, Drug, Research, Lifestyle & Imaging
🔍 Semantic evaluation for accuracy and ethical alignment
💬 Conversational memory for context continuity
🖼️ File and image understanding with EasyOCR + BLIP
💡 Hybrid multimodal outputs with medical diagrams & text
🔐 Privacy-preserving, audit-ready architecture

🔁 Feature Matrix: Previous vs Streamlit Edition

Capability	Previous Project	New (Streamlit Edition)
UI	None	Streamlit dashboard with chat, file & image uploads
Inference	Cloud-based APIs	Gemini + Local FAISS hybrid (privacy-first)
Pipeline	Basic multi-agent	Reinforced 5-agent orchestration with guardrails
Privacy	Limited	Local domain-restricted, privacy-controlled
File Handling	None	Browse Files for document upload & analysis
Image Processing	None	Browse Images with EasyOCR + BLIP integration
Memory	Stateless	Persistent conversational summarization
Evaluation	None	Semantic similarity scoring & factual verification
Safety Layer	Minimal	Ethical filters & domain validation
Usage Modes	CLI only	CLI + Python API + Streamlit UI
Outputs	Text only	Text + images + structured data + visual insights

🧬 Introduction

In modern healthcare, artificial intelligence is rapidly evolving from passive information retrieval to interactive medical reasoning systems. However, many existing medical chatbots depend on external cloud APIs and unverified data sources, leading to potential privacy violations, latency, and unreliable medical responses.
To address these challenges, MEDIBOT introduces a multi-agent, retrieval-augmented, and privacy-focused medical conversational framework that operates under strict domain control and ethical safety constraints.

At its core, MEDIBOT is an intelligent healthcare assistant built using LangChain, FAISS, and Google Gemini large language models (LLMs). It employs a multi-agent orchestration pipeline to deliver accurate, explainable, and context-aware medical insights. Each agent is designed to handle a specific clinical sub-domain—ranging from diagnosis, drug information, and lifestyle guidance to medical research summarization and image-based interpretation.

⚙️ System Overview

The MEDIBOT system is divided into three major layers:

User Interaction Layer (Streamlit UI)

Provides a clean, interactive chat interface for both professionals and students.
Supports file uploads (PDF, DOCX) and image uploads (X-ray, prescription scans) through EasyOCR and BLIP, allowing the chatbot to analyze documents and images directly.
Displays real-time agent reasoning, chat history, and semantic evaluation feedback.

Agent Orchestration Layer (LangChain Multi-Agent Framework)

Implements domain-specific agents, including:

🧠 Diagnosis Agent – Identifies possible conditions or symptoms from patient input.
💊 Drug Information Agent – Retrieves dosage, side effects, and interactions from medical knowledge bases.
🧘 Lifestyle Agent – Provides personalized recommendations for diet, fitness, and wellness.
📚 Research Agent – Summarizes academic studies and medical literature.
🖼️ Image Agent – Interprets uploaded medical images via BLIP or visual models.

The controller agent manages conversation flow, delegates tasks, and ensures responses stay within medical boundaries.

Retrieval & Memory Layer (FAISS + Semantic Evaluation)

Uses FAISS vector storage to retrieve factual medical information and prior conversation context.
Embeddings are generated using HuggingFace Sentence Transformers (MiniLM-L6-v2) for efficient semantic search.
Implements semantic similarity scoring to validate factual accuracy and improve response relevance.
Maintains short- and long-term memory for coherent multi-turn dialogue.
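
The retrieval step in this layer amounts to a nearest-neighbour search over embedding vectors. The sketch below illustrates that idea in pure Python and is not MEDIBOT's code: the three-dimensional vectors, the `store` contents, and the `top_k` helper are hypothetical stand-ins for MiniLM embeddings and a FAISS index.

```python
import math

def normalize(vec):
    """Scale a vector to unit length so inner product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)

def top_k(query_vec, store, k=2):
    """Return the k best (score, text) pairs from `store`, highest score first.

    `store` is a list of (text, vector) pairs. The brute-force scan below is a
    stand-in for the index lookup that a vector store such as FAISS performs.
    """
    q = normalize(query_vec)
    scored = [
        (sum(a * b for a, b in zip(q, normalize(vec))), text)
        for text, vec in store
    ]
    return sorted(scored, reverse=True)[:k]

# Hypothetical toy vectors; real sentence embeddings have hundreds of dimensions.
store = [
    ("Influenza commonly presents with fever and body ache.", [0.9, 0.1, 0.0]),
    ("Paracetamol is used for fever and mild pain.", [0.6, 0.6, 0.1]),
    ("Regular exercise supports cardiovascular health.", [0.0, 0.2, 0.9]),
]
hits = top_k([1.0, 0.0, 0.0], store, k=2)
for score, text in hits:
    print(f"{score:.3f}  {text}")
```

Normalizing both sides first means a plain inner product ranks documents by cosine similarity, which is a common way to configure a vector index for semantic search.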

🔐 Privacy and Data Handling

Unlike conventional chatbots, MEDIBOT is designed to run locally, with external calls limited to a controlled set of APIs, so that uploaded files, embeddings, and logs stay in the user's environment.

All uploaded files, embeddings, and logs are stored in local directories (./vectorstore/, ./outputs/).
API keys are loaded from a .env file and never logged or transmitted externally.
A domain enforcement layer filters out unrelated or unsafe prompts before processing.
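
As an illustration of the key-handling point above, here is a minimal standard-library sketch of reading a `.env` file and masking a secret before it could reach a log line. The project may well use a dedicated package for this; `load_env` and `mask` are hypothetical helpers written for this example.

```python
import os

def load_env(path=".env"):
    """Minimal .env reader: KEY=value per line, '#' comments, no quoting rules."""
    loaded = {}
    try:
        with open(path, encoding="utf-8") as fh:
            for line in fh:
                line = line.strip()
                if not line or line.startswith("#") or "=" not in line:
                    continue
                key, value = line.split("=", 1)
                loaded[key.strip()] = value.strip()
    except FileNotFoundError:
        pass  # a missing file simply yields no values
    for key, value in loaded.items():
        # Values already set in the real environment take precedence.
        os.environ.setdefault(key, value)
    return loaded

def mask(secret, visible=4):
    """Return a log-safe form of a secret, showing only the last few characters."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]
```

Logging `mask(key)` instead of the key itself keeps enough of the value to tell keys apart while keeping the secret out of log files.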

This architecture enables offline or semi-offline operation and is intended to align with healthcare privacy standards, which makes it suitable for educational, diagnostic, and research contexts.

💡 Core Technologies

Component	Technology Used	Purpose
Framework	Streamlit	Interactive user interface for chat and visualization
Orchestration	LangChain Multi-Agent	Manages agent collaboration and task routing
Retrieval	FAISS	Fast vector similarity search for medical facts
Embeddings	HuggingFace Sentence Transformers	Semantic encoding of text for retrieval
LLM Backend	Google Gemini (via LangChain)	Core natural language reasoning and generation
Image & Text Extraction	BLIP, EasyOCR	Image captioning and text extraction from medical documents
Data Storage	Local Vectorstore + JSON Logs	Persistent storage for embeddings, chat history, and evaluations

🌍 Purpose and Vision

The overarching vision behind MEDIBOT is to make medical AI assistance both ethical and accessible.
By running locally and maintaining full transparency in its reasoning process, MEDIBOT ensures that healthcare AI remains trustworthy, explainable, and privacy-respecting.
Its modular design also allows it to scale to new medical domains or integrate with electronic health systems in the future.

“MEDIBOT transforms medical dialogue into a secure partnership between AI intelligence and human expertise—bridging data ethics with digital care.”

Methodology

🏗 System Architecture Overview

MEDIBOT is built on a layered, modular design that combines retrieval-based intelligence, domain-specific agent workflows, and memory-aware conversation control.

The system starts with a user query, which is first screened through a Medical Domain Filter. Based on the detected intent (e.g., symptoms, drugs, diet, research), the query is routed to the corresponding specialized agent. Each agent performs its own reasoning, retrieval, or generation task and returns a structured medical response.
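
The screening and routing flow described above can be sketched as a small keyword router. This is only an illustration under stated assumptions: the keyword lists and the `route` function are hypothetical, and the actual filter may rely on the LLM rather than on keywords.

```python
import re

# Hypothetical keyword lists; a real deployment would need far broader coverage.
ROUTES = {
    "drug": ["dosage", "side effect", "tablet", "drug", "medication"],
    "lifestyle": ["diet", "exercise", "workout", "sleep"],
    "research": ["study", "studies", "trial", "research"],
    "diagnosis": ["fever", "pain", "cough", "ache", "symptom"],
}

def route(query):
    """Return the name of the agent that should handle `query`.

    Returns None when no medical keyword matches, which models the
    domain filter rejecting the query.
    """
    text = query.lower()
    for agent, keywords in ROUTES.items():
        # \b anchors the keyword at a word start, so "ache" matches "aches".
        if any(re.search(r"\b" + re.escape(k), text) for k in keywords):
            return agent
    return None

print(route("I have fever, sore throat, and body ache."))
print(route("What is the dosage of paracetamol?"))
print(route("Tell me a joke about politics"))
```

Routes are checked in dictionary order, so a query that mentions both a drug term and a symptom goes to the drug agent in this sketch; a production router would need an explicit tie-breaking rule.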

At its core, the LangChain ConversationalRetrievalChain powers MEDIBOT’s reasoning loop, supported by:

FAISS Vector Store for semantic medical knowledge retrieval.

Google Gemini 2.0 Flash for controlled, evidence-based natural language generation.

ConversationBufferMemory and summary mechanisms for context retention.
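
To make the context-retention idea concrete, the following sketch keeps a fixed window of recent turns and folds older ones into a running summary. It does not use LangChain; `RollingMemory` and `truncate_summarizer` are hypothetical names, and the truncating summarizer is a placeholder for what would presumably be an LLM call.

```python
from collections import deque

def truncate_summarizer(summary, turn, width=40):
    """Placeholder summarizer: append the first `width` characters of a turn."""
    snippet = turn[:width]
    return f"{summary} | {snippet}" if summary else snippet

class RollingMemory:
    """Keep the last `window` turns verbatim and fold older ones into a summary."""

    def __init__(self, window=5, summarize=truncate_summarizer):
        self.window = window
        self.summarize = summarize
        self.recent = deque()
        self.summary = ""

    def add(self, user, bot):
        self.recent.append(f"User: {user} / Bot: {bot}")
        # Evict the oldest turns into the summary once the window is full.
        while len(self.recent) > self.window:
            evicted = self.recent.popleft()
            self.summary = self.summarize(self.summary, evicted)

    def context(self):
        """Text that could be prepended to the next prompt."""
        parts = [f"Summary: {self.summary}"] if self.summary else []
        return "\n".join(parts + list(self.recent))
```

With the default window of five, the sixth turn pushes the first one into the summary, so the prompt stays bounded while early context is not lost entirely.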

🧩 Agents and Roles

Agent	Description
Diagnosis Agent	Fetches PubMed abstracts, builds contextual FAISS vector stores, and generates possible diagnoses with reasoning and test recommendations.
Drug Information Agent	Queries the OpenFDA API for accurate drug label data, dosage, contraindications, and safety warnings.
BMI Agent	Computes body mass index from user input and classifies the category with tailored lifestyle advice.
Lifestyle & Prevention Agent	Combines WGER API data with Gemini-driven reasoning to provide exercise, diet, and wellness plans.
Medical Research Agent	Retrieves latest medical studies from Europe PMC and summarizes findings using LLM-based synthesis.
Image Agent	Generates medical diagrams using Gemini 2.5 Image Model and Hugging Face Stable Diffusion XL as backup.

Detailed Responsibilities

Diagnosis Agent

Performs symptom extraction using regex and heuristic matching.
Searches PubMed via E-utilities and fetches recent abstracts.
Creates an on-the-fly FAISS vector store for symptom-specific retrieval.
Generates differential diagnoses, suggested tests, and clinical reasoning.
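
The first two steps of this agent can be sketched as follows. The symptom vocabulary and both helper functions are hypothetical; the endpoint and the `db`, `term`, `retmax`, `retmode`, and `sort` parameters belong to the public NCBI E-utilities ESearch service. The code only builds the request URL and makes no network call.

```python
import re
from urllib.parse import urlencode

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Hypothetical symptom vocabulary, for illustration only.
SYMPTOMS = ["fever", "sore throat", "body ache", "cough", "headache"]

def extract_symptoms(text):
    """Return the known symptoms mentioned in `text`, in vocabulary order."""
    lowered = text.lower()
    return [s for s in SYMPTOMS
            if re.search(r"\b" + re.escape(s) + r"\b", lowered)]

def build_esearch_url(symptoms, retmax=5):
    """Build an ESearch URL for abstracts that mention every symptom."""
    term = " AND ".join(f'"{s}"[Title/Abstract]' for s in symptoms)
    params = {
        "db": "pubmed",
        "term": term,
        "retmax": retmax,
        "retmode": "json",
        "sort": "pub_date",  # most recent publications first
    }
    return ESEARCH + "?" + urlencode(params)

found = extract_symptoms("I have fever, sore throat, and body ache.")
print(found)
print(build_esearch_url(found))
```

ESearch returns article identifiers only; fetching the abstracts themselves would be a second request to the EFetch endpoint before any vector store can be built.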

Drug Information Agent

Calls OpenFDA endpoints for official drug labeling data.
Displays indications, dosage, contraindications, and warnings.
Presents information in an easy-to-read, structured format.
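
A rough sketch of how a drug label lookup and its structured display might look is given below. The endpoint and the field names (`indications_and_usage`, `dosage_and_administration`, `contraindications`, `warnings`) come from the public openFDA drug label API; the helper functions and the placeholder text are assumptions made for this example. No request is sent.

```python
from urllib.parse import urlencode

OPENFDA_LABEL = "https://api.fda.gov/drug/label.json"

def build_label_url(generic_name, limit=1):
    """Build a label search URL for one generic drug name."""
    query = {"search": f'openfda.generic_name:"{generic_name}"', "limit": limit}
    return OPENFDA_LABEL + "?" + urlencode(query)

# Display title -> openFDA field; each field holds a list of strings.
SECTIONS = [
    ("Indications", "indications_and_usage"),
    ("Dosage", "dosage_and_administration"),
    ("Contraindications", "contraindications"),
    ("Warnings", "warnings"),
]

def format_label(record):
    """Render one label record as readable text; missing sections are marked."""
    lines = []
    for title, key in SECTIONS:
        values = record.get(key) or ["Not listed on this label"]
        lines.append(f"{title}: {values[0]}")
    return "\n".join(lines)

# Hypothetical record with placeholder text, not real label data.
example = {"indications_and_usage": ["(placeholder indication text)"]}
print(build_label_url("acetaminophen"))
print(format_label(example))
```

Marking absent sections explicitly, rather than omitting them, keeps the output layout stable and makes it clear that the label, not the agent, is silent on that point.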

BMI Agent

Extracts height and weight values from text.
Calculates BMI and classifies user into WHO categories.

Returns actionable diet and health advice.
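
The BMI steps above can be sketched in a few lines. BMI is weight in kilograms divided by the square of height in metres, and the cut-offs used are the standard WHO adult categories. The regular expressions assume the user writes metric units such as "175 cm" and "70 kg"; how MEDIBOT actually parses input is not shown in this document, so `parse_height_weight` is a hypothetical helper.

```python
import re

def parse_height_weight(text):
    """Extract height in cm and weight in kg from free text.

    Returns (height_cm, weight_kg), or None if either value is missing.
    """
    h = re.search(r"(\d+(?:\.\d+)?)\s*cm\b", text, re.IGNORECASE)
    w = re.search(r"(\d+(?:\.\d+)?)\s*kg\b", text, re.IGNORECASE)
    if not (h and w):
        return None
    return float(h.group(1)), float(w.group(1))

def bmi(height_cm, weight_kg):
    """Body mass index: kg divided by metres squared."""
    metres = height_cm / 100
    return weight_kg / (metres ** 2)

def who_category(value):
    """Map a BMI value to its WHO adult category."""
    if value < 18.5:
        return "Underweight"
    if value < 25:
        return "Normal weight"
    if value < 30:
        return "Overweight"
    return "Obese"

height, weight = parse_height_weight("I am 175 cm tall and weigh 70 kg")
value = bmi(height, weight)
print(f"BMI {value:.1f}: {who_category(value)}")  # BMI 22.9: Normal weight
```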

Lifestyle & Prevention Agent

Uses Gemini to detect intent (exercise, diet, or mixed).
Fetches real exercises and food suggestions from WGER.
Generates motivating, structured wellness plans with emojis and personalization.

Medical Research Agent

Retrieves top scientific articles from Europe PMC API.
Summarizes study highlights, authors, journals, and publication years.
Generates a readable summary or detailed synthesis based on query type.

Image Agent

Creates labeled medical diagrams via Gemini image generation.
Falls back to Stable Diffusion XL if Gemini API quota is exceeded.
Saves outputs with timestamps for reproducibility and visual education.
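
The fallback and timestamping behaviour can be illustrated without calling any image model. In this sketch the two backends are passed in as plain callables, so `generate_with_fallback`, `timestamped_name`, and `quota_exceeded` are hypothetical names and no Gemini or Hugging Face API is invoked.

```python
import re
from datetime import datetime

def generate_with_fallback(prompt, primary, backup):
    """Try `primary(prompt)`; if it raises, use `backup(prompt)` instead.

    Returns (image_bytes, backend_label).
    """
    try:
        return primary(prompt), "primary"
    except Exception:
        # Covers quota errors and any other failure of the primary backend.
        return backup(prompt), "backup"

def timestamped_name(prompt, now=None):
    """Build a filename such as 20250102_030405_human_heart.png."""
    now = now or datetime.now()
    slug = re.sub(r"[^a-z0-9]+", "_", prompt.lower()).strip("_")[:30]
    return f"{now:%Y%m%d_%H%M%S}_{slug}.png"

def quota_exceeded(prompt):
    """Stand-in for a primary backend whose quota has run out."""
    raise RuntimeError("quota exceeded")

image, backend = generate_with_fallback(
    "Human heart", quota_exceeded, lambda p: b"fake-image-bytes"
)
print(backend, timestamped_name("Human heart"))
```

Recording which backend produced each file alongside the timestamp makes it easier to reproduce or compare outputs later.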

🧠 Memory & Semantic Evaluation

Conversation Memory: Tracks the 5 most recent dialogue turns using LangChain’s ConversationBufferMemory.
Summary Memory: Compresses older chats into a brief context string for continuity.
Semantic Evaluation: Measures factual alignment between retrieved context and generated answers using cosine similarity on HuggingFace embeddings.
Non-Medical Filter: Blocks unrelated, political, or humorous inputs to maintain medical domain integrity.
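
The semantic evaluation step compares the retrieved context with the generated answer. The sketch below shows the cosine-similarity arithmetic using bag-of-words counts as a crude stand-in for HuggingFace embeddings; the `evaluate` function and its 0.5 threshold are assumptions chosen for illustration, not values taken from MEDIBOT.

```python
import math
import re
from collections import Counter

def bow(text):
    """Bag-of-words vector; a crude stand-in for a sentence embedding."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def evaluate(context, answer, threshold=0.5):
    """Score how closely `answer` follows `context`; flag low-scoring answers."""
    score = cosine(bow(context), bow(answer))
    return {"score": round(score, 3), "grounded": score >= threshold}

print(evaluate("Influenza causes fever and body ache.",
               "Fever and body ache suggest influenza."))
print(evaluate("Influenza causes fever and body ache.",
               "The stock market rose today."))
```

A high score only indicates that the answer stays close to the retrieved text; it does not by itself establish that the retrieved text is medically correct.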

🛠 Tool Integration

Component	Purpose	Key Features
LangChain	Core orchestration	Conversation chains, memory, and prompt templates
FAISS	Vector database	Fast semantic retrieval for medical documents
Google Gemini	Text & image generation	Context-aware reasoning and multimodal outputs
Hugging Face Models	Backup image generation	High-resolution medical illustrations
OpenFDA / PubMed / Europe PMC APIs	Data sources	Verified, real-world medical datasets
Rich Console + CSV Logging	UI & tracking	Colorized terminal output with semantic similarity logs

The Architecture at a Glance

To help visualize how the system works internally, consider the following architecture diagram:

🔧 Project Structure Snapshot

Screenshot (514).png

Experiments

📄 Output (Diagnosis Example)

User: “I have fever, sore throat, and body ache.”
Agent Activated: Diagnosis Agent
Output:

Possible Diagnoses: Influenza, Streptococcal Pharyngitis, Viral Infection

Recommended Tests: Throat swab, CBC, PCR

Reasoning: Based on symptom cluster, most likely viral etiology with supportive management.

Sources: PubMed abstracts from 2022–2024

Screenshot (512).png

🧬 Output (Drug Information Example)

User: “Tell me about Paracetamol”
Agent Activated: Drug Info Agent
Output Summary:

Brand & Generic: Tylenol | Paracetamol

Indications: Fever, mild to moderate pain relief

Dosage: 500 mg every 6 hours (max 4 g/day)

Contraindications: Liver disease, alcohol use

Source: FDA Drug Label Database

Screenshot (509).png

🩻 Output (Image Generation Example)

User: “Generate a diagram of the human heart.”
Agent Activated: Image Agent
Output:

Generated via Gemini 2.5 Flash Preview

Backup available via Stable Diffusion XL

Screenshot (513).png

🔧 Installation & Quick-Start Guide

Follow these steps to get the MEDIBOT Multi-Agent System running on your local machine.

Clone the repository

git clone https://github.com/pamuarun/MEDIBOT-Multi-Agent-System.git
cd MEDIBOT-Multi-Agent-System

Create a virtual environment (optional but recommended)

python -m venv .venv
source .venv/bin/activate # on macOS/Linux
.venv\Scripts\activate # on Windows

Install Python dependencies

pip install -r requirements.txt

Add your API keys in the .env file
Example:
GOOGLE_API_KEY=your_gemini_api_key
HF_TOKEN=your_huggingface_token

Run the Streamlit app

streamlit run app.py

🩺 Conclusion

MEDIBOT stands as a testament to how cutting-edge AI innovation can coexist with medical ethics, privacy, and transparency.
Through the seamless fusion of LangChain’s RAG pipeline, multi-agent orchestration, and an intuitive Streamlit interface, it demonstrates that healthcare AI can be both powerful and principled.

This architecture not only ensures factual precision and domain safety but also redefines the standard for responsible AI in medicine—where every response respects the sanctity of data and the trust of its users.
By bridging intelligent automation with human-centered design, MEDIBOT lays the groundwork for a new era of ethical, explainable, and privacy-preserving digital healthcare.

"The future of healthcare isn't about replacing doctors with AI; it's about empowering them with intelligence that respects privacy, ethics, and humanity."

🛠️ Maintenance & Support

To ensure long-term usability and reliability of MEDIBOT, a structured maintenance and support protocol has been implemented.

🔄 Update Cycle

Core agents, vector database, and safety filters are updated every 4–6 weeks.
Updates include:
Bug fixes
Performance improvements
New medical knowledge sources
Security patches

📈 Monitoring & Logs

Audit logs record timestamps, agent decisions, retrieval source IDs, and errors.
Monitoring ensures quick detection of failures in:
LLM responses
Retrieval operations
Agent routing
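
An audit entry with the fields listed above could be written as one JSON object per line. The field names and the `audit_record` helper below are assumptions for illustration; the actual log schema is not given in this document.

```python
import json
from datetime import datetime, timezone

def audit_record(agent, decision, source_ids=(), error=None, now=None):
    """Serialize one audit entry as a single JSON line."""
    now = now or datetime.now(timezone.utc)
    return json.dumps({
        "timestamp": now.isoformat(),
        "agent": agent,
        "decision": decision,
        "source_ids": list(source_ids),  # IDs of retrieved chunks, if any
        "error": error,                  # None when the step succeeded
    })

# Hypothetical values, shown only to illustrate the shape of an entry.
print(audit_record("diagnosis", "routed", ["doc-001", "doc-007"]))
print(audit_record("drug_info", "failed", error="timeout"))
```

One object per line keeps the log appendable and easy to filter with standard tools, which suits the failure-detection use described here.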

🧪 Testing Strategy & Coverage

MEDIBOT follows a multi-layered testing workflow to guarantee accuracy, stability, and safety.

🧩 1. Unit Tests

Validate core agent logic:

Diagnosis Agent
Drug Info Agent
OCR Agent
Test prompt templates, retrieval functions, and safety checks.

🔗 2. Integration Tests

Ensure orchestrator ↔ agents communication works without deadlocks.
Validate end-to-end RAG pipeline:
embedding → FAISS retrieval → response assembly
Test interactions with external APIs (Gemini, HF models) using mocked responses.

🧭 3. End-to-End (E2E) Tests

Simulate complete user workflows:
Symptom input → diagnosis output
Drug queries → FDA label fetch
File upload → OCR + summarization
Image prompt → generated diagram

⚡ Continuous Validation

All tests run on every push.
Safety filters and PHI redaction logic undergo routine validation.
Vectorstore integrity validated each session to prevent embedding corruption.

🧷 Reliability Enhancements

MEDIBOT integrates multiple mechanisms to maintain dependable clinical performance.

✔️ Input Validation

Strict validation for user messages, file uploads, and API requests.
Rejects unsupported formats, oversized files, and malformed inputs.
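
A file check of this kind might look like the sketch below. The allowed extensions follow the upload types mentioned in this document (PDF, DOCX, images), while the exact image extensions, the 10 MB limit, and the `validate_upload` helper are assumptions.

```python
import os

# Assumed allow-list and size limit; adjust to the deployment's real policy.
ALLOWED_EXTENSIONS = {".pdf", ".docx", ".png", ".jpg", ".jpeg"}
MAX_BYTES = 10 * 1024 * 1024

def validate_upload(filename, size_bytes):
    """Return (ok, message) for an uploaded file's name and size."""
    ext = os.path.splitext(filename)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        return False, f"Unsupported file type: {ext or 'none'}"
    if size_bytes <= 0:
        return False, "File is empty"
    if size_bytes > MAX_BYTES:
        return False, "File exceeds the size limit"
    return True, "ok"

print(validate_upload("report.PDF", 120_000))
print(validate_upload("script.exe", 1_000))
print(validate_upload("scan.png", 50 * 1024 * 1024))
```

An extension check is only a first gate; the parser that opens the file should still handle content that does not match its extension.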

✔️ Health Checks

Regular system checks monitor:
LLM connectivity
Vectorstore availability
Agent responsiveness
Memory health and session stability

✔️ Retry Mechanisms

Automatic retries for API timeouts or transient failures.
Prevents crashes and maintains user experience during temporary outages.
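
One common way to implement such retries is exponential backoff, sketched below. The attempt count, delays, and the `with_retries` helper are assumptions; the document does not state which retry policy MEDIBOT uses.

```python
import time

def with_retries(fn, attempts=3, base_delay=0.5,
                 retry_on=(TimeoutError, ConnectionError), sleep=time.sleep):
    """Call `fn()` and retry transient failures with exponential backoff.

    The delay doubles after each failed attempt (0.5 s, 1 s, ...). The last
    failure is re-raised so the caller can show a fallback message.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

Only exception types listed in `retry_on` are retried, so programming errors surface immediately instead of being masked by repeated attempts. Passing a custom `sleep` makes the helper testable without real waiting.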

✔️ Graceful Error Handling

Clear fallback messages when an agent encounters an error.
Error cause documented in logs.
No raw tracebacks exposed to the user.

🧑‍💻 Support & Issue Handling

As MEDIBOT is a research prototype developed for academic purposes, dedicated support channels are not provided.
However:

Bug fixes and improvements are maintained manually.
Issues are recorded internally during testing and development.
Documentation serves as the primary reference for installation, usage, and troubleshooting.

📊 Test Coverage Approach

MEDIBOT does not use automated test coverage tools such as pytest --cov or coverage.py.
Instead, each component was tested manually during development, as described below.

✔️ Manual Testing Performed

Each agent module (Diagnosis, OCR, Drug Info, etc.) was executed and verified individually.
The orchestrator’s routing and decision flow were checked across multiple scenarios.
Retrieval operations, embeddings generation, and response assembly were tested for correctness.
Complete end-to-end workflows (symptom → output, file upload → summary, etc.) were executed multiple times without errors.
Logs and intermediate outputs were reviewed to ensure accurate and stable behavior across sessions.

✔️ Reason for No Automated Coverage

Since the system is currently stable and functioning without issues, automated coverage tools were not required.
Future versions may include automated testing for CI/CD integration, but the present version relies on carefully validated manual testing.

🧭 Significance & Implications

Screenshot (507).png

By orchestrating specialised medical agents under strict domain control, MEDIBOT demonstrates that clinically relevant, multimodal, and context-aware healthcare dialogue can be achieved without exposing data to external APIs or sacrificing patient privacy.
The system validates that retrieval-augmented reasoning and domain-filtered AI workflows can coexist in a secure, local, and explainable architecture—bridging the gap between AI innovation and medical data ethics.

🛡️ Security, Privacy & Safety

Threat Model & Trust Boundaries

Local Execution: Embedding and vector retrieval run locally. Outbound traffic is limited to the model and data-source APIs the agents call (for example Gemini, OpenFDA, and PubMed); nothing else is transmitted by default.
Data Isolation: Uploaded medical files, chat logs, and embeddings are stored only within the project workspace (./vectorstore/ and ./outputs/).
Cost Efficiency: No recurring API usage fees for embeddings or retrieval; local FAISS ensures offline querying.
Explicit Trust Zones: UI ↔ Orchestrator ↔ Agents ↔ Vectorstore ↔ Storage — clear, auditable separation of responsibilities.

⚙️ Controls

Input Hardening: Uploaded files validated by type (PDF, DOCX, image) and size before parsing.
Prompt Safety: Medical domain enforcement filters reject unrelated or sensitive non-medical queries.
Tool Allow-Listing: Only authorised tools (EasyOCR, BLIP, FAISS retriever, Gemini LLM) are permitted; each runs with timeouts and exception guards.
Output Filtering: Removes PHI/PII patterns, profanity, and non-medical outputs.
Secrets Handling: .env variables (API keys) loaded once at runtime; never stored in memory dumps or logs.
Logging & Audit: Structured logs include timestamps, run IDs, and agent traces; medical responses can be traced back to vector citations.
Dependency Security: Periodic pip-audit scans; minimal trusted library set.
Sandboxing: Streamlit session isolated; no shell command execution; file I/O constrained to app directories.
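
The output-filtering control can be illustrated with pattern-based redaction. The three patterns below (email, US-style phone number, US Social Security number) are assumptions picked for the example; real PHI detection needs much broader coverage, and MEDIBOT's actual patterns are not listed in this document.

```python
import re

# Illustrative patterns only; they do not cover names, addresses, or record IDs.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(text):
    """Replace every match of each pattern with a labelled placeholder."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label} REDACTED]", text)
    return text

print(redact("Mail me at jane.doe@example.com or call 555-123-4567."))
```

Labelled placeholders keep the sentence readable and show the reader what kind of information was removed.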

⚖️ Compliance Alignment

Privacy: Adheres to data-minimization and purpose-limitation principles—inputs processed locally, no telemetry sent externally.
Healthcare Ethics: Implements domain-specific content control and factual validation aligned with clinical communication norms.
GDPR & HIPAA Readiness: Local storage only, optional data-purge (./outputs/ cleanup), and transparent logging for audit trails.
Transparency: Each response includes optional reference sources from FAISS retrieval for user verification.

🔐 Security Test Cases (in Continuous Validation)

Test Case	Expected Behavior
Prompt injection or domain bypass attempts	Blocked with safety message
Oversized / malicious file uploads	Rejected with controlled error
Tool timeout or LLM failure	Graceful fallback + log entry
PHI/PII content in user text	Detected and redacted before response
File path traversal attempts	Denied, logged, no data exposure
Memory overflow or malformed vectorstore	Handled safely; recovery on next session
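
As an example of how one of these cases could be enforced, the sketch below rejects file paths that resolve outside an allowed directory. The `safe_path` helper is hypothetical; `outputs` is used because it is one of the workspace folders named in this document.

```python
from pathlib import Path

def safe_path(base_dir, user_path):
    """Resolve `user_path` inside `base_dir`, or raise PermissionError.

    Both paths are resolved first, so "../" segments and absolute paths
    cannot escape the base directory.
    """
    base = Path(base_dir).resolve()
    candidate = (base / user_path).resolve()
    if candidate != base and base not in candidate.parents:
        raise PermissionError(f"Path escapes {base_dir}: {user_path}")
    return candidate

print(safe_path("outputs", "report.txt").name)
try:
    safe_path("outputs", "../.env")
except PermissionError as exc:
    print("denied:", exc)
```

Comparing resolved paths, rather than searching the input string for "..", also covers absolute paths supplied by the user.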

📜 Licensing & Usage Rights

This project is released under the GNU General Public License v3.0 (GPL-3.0).

✔️ Commercial & private use
✔️ Distribution & modification
✔️ Patent use

Permissions of this strong copyleft license are conditioned on making available complete source code of licensed works and modifications, which include larger works using a licensed work, under the same license. Copyright and license notices must be preserved. Contributors provide an express grant of patent rights.

A full license text is available in the accompanying LICENSE file.
When redistributing, please retain copyright and attribution notices.

Model License:

External LLMs and APIs (e.g., Google Gemini, Hugging Face, WGER, OpenFDA) are governed by their respective creators’ licenses.
Ensure compliance when integrating or extending third-party components.

🌐 Access to Technical Assets

Asset	Link / Location
Source Code	https://github.com/pamuarun/MEDIBOT-Multi-Agent-System/blob/main/app.py
Example Outputs	`outputs/` folder (chat logs, semantic evaluation reports, and AI-generated medical diagrams)