A Safety-Aware Document-Grounded Chatbot using Groq and ChromaDB

This project implements a Retrieval-Augmented Generation (RAG) assistant that combines vector-based document retrieval with large language model (LLM) inference to produce grounded, context-aware responses.
Unlike a standalone LLM, which may hallucinate, this system retrieves relevant passages from an external document corpus before generating an answer, which improves factual accuracy and makes responses traceable to their source documents.
The system also incorporates basic safety guardrails to support responsible usage.
The RAG assistant follows a modular pipeline:
1. Document ingestion and preprocessing
2. Dense vector embedding and persistent storage
3. Safety-aware query handling
4. Similarity-based retrieval
5. Context-conditioned response generation
This design makes the system suitable for academic, research, and domain-specific applications.
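The five pipeline stages above can be sketched end to end with the standard library alone. This is an illustrative toy, not the project's code: word-count vectors stand in for dense embeddings, an in-memory list stands in for ChromaDB, and the final prompt is returned rather than sent to Groq. All function names, the blocked-term set, and the two sample documents are hypothetical.

```python
import math
import re
from collections import Counter

BLOCKED_TERMS = {"malware", "exploit"}  # hypothetical placeholder rules
REFUSAL = "Sorry, I can't help with that request."
FALLBACK = "I could not find that in the provided documents."


def tokens(text):
    """Lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def chunk_text(text, size=40):
    """Stage 1: split a document into chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text):
    """Stage 2 (stand-in): a word-count vector, NOT a dense embedding."""
    return Counter(tokens(text))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def is_safe(query):
    """Stage 3: reject queries containing any blocked term."""
    return not (set(tokens(query)) & BLOCKED_TERMS)


def retrieve(query, index, k=2):
    """Stage 4: return up to k chunks with non-zero similarity, best first."""
    q = embed(query)
    scored = [(cosine(q, vec), chunk) for chunk, vec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for score, chunk in scored[:k] if score > 0]


def answer(query, index):
    """Stage 5: build the context-conditioned prompt an LLM would receive."""
    if not is_safe(query):
        return REFUSAL
    chunks = retrieve(query, index)
    if not chunks:
        return FALLBACK
    context = "\n".join(f"- {c}" for c in chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


docs = ["ChromaDB stores embeddings on disk.", "Groq serves LLM inference."]
index = [(chunk, embed(chunk)) for doc in docs for chunk in chunk_text(doc)]
print(answer("Where are embeddings stored?", index))
```

In a real deployment the `index` list would be replaced by a persistent vector store and the returned prompt would be passed to the LLM; the control flow (safety check, then retrieval, then generation or fallback) stays the same.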
project/
├── LICENSE
├── app.py # Main application entry point
├── vectordb.py # Vector database (ChromaDB) wrapper
├── safety.py # Safety and content guardrails
├── requirements.txt # Project dependencies
├── README.md # Project documentation
├── images/ # Visual assets
│   ├── rag_cover.png
│   ├── rag_architecture.png
│   └── methodology_flow.png
├── data/ # Input .txt documents
└── rag_clients_db/ # Persistent ChromaDB storage


To ensure responsible usage, the system includes:
- Rule-based filtering for unsafe or restricted queries
- Conservative fallback responses when information is unavailable
- Prompt instructions to avoid hallucination
These safeguards provide a foundation for future enhancements such as neural moderation and audit logging.
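As a rough illustration of how the three safeguards listed above might fit together, the sketch below pairs a regex-based filter with a fallback message and a grounding instruction. The contents of `safety.py` are not shown in this document, so the patterns, messages, and the names `check_query`, `respond`, and `Verdict` are all assumptions made for the example.

```python
import re
from dataclasses import dataclass

# Hypothetical rules; a real deployment would maintain its own list.
BLOCKED_PATTERNS = [
    (re.compile(r"\b(steal|crack)\s+(a\s+)?passwords?\b", re.I), "security abuse"),
    (re.compile(r"\bbypass\s+(the\s+)?license\b", re.I), "policy violation"),
]

FALLBACK = "I could not find that in the provided documents."
SYSTEM_PROMPT = (
    "Answer only from the supplied context. "
    "If the context does not contain the answer, say so."
)


@dataclass
class Verdict:
    allowed: bool
    reason: str = ""


def check_query(query: str) -> Verdict:
    """Rule-based filter: block the query if any pattern matches."""
    for pattern, reason in BLOCKED_PATTERNS:
        if pattern.search(query):
            return Verdict(False, reason)
    return Verdict(True)


def respond(query: str, retrieved_chunks: list[str]) -> str:
    """Apply the filter, then fall back or build a grounded prompt."""
    verdict = check_query(query)
    if not verdict.allowed:
        return f"Request declined ({verdict.reason})."
    if not retrieved_chunks:
        # Conservative fallback: never let the model answer without context.
        return FALLBACK
    context = "\n".join(retrieved_chunks)
    return f"{SYSTEM_PROMPT}\n\nContext:\n{context}\n\nQuestion: {query}"
```

Returning a `Verdict` with a reason, rather than a bare boolean, leaves room for the audit logging mentioned above without changing the call sites.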
- Python 3.10 or higher
- Groq API key
- Basic Python knowledge
- Sufficient memory for embeddings
git clone https://github.com/manishreddygutha-spec/RAG-Assistant.git
cd RAG-Assistant
python -m venv .venv
source .venv/bin/activate # Linux/macOS
.venv\Scripts\Activate.ps1 # Windows
pip install -r requirements.txt
GROQ_API_KEY=your_api_key_here
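A small startup check can catch a missing or placeholder key before any request is made. The sketch below reads the key from the process environment; how the project itself loads the value (for example from a `.env` file) is not stated here, and the helper name `get_groq_api_key` is hypothetical.

```python
import os


def get_groq_api_key(env=None):
    """Return the Groq API key, or raise if it is missing or a placeholder."""
    env = os.environ if env is None else env
    key = env.get("GROQ_API_KEY", "").strip()
    if not key or key == "your_api_key_here":
        raise RuntimeError(
            "GROQ_API_KEY is not set. Replace the placeholder with a real key."
        )
    return key


# Typical use at startup (raises RuntimeError if the key is absent):
# api_key = get_groq_api_key()
```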
Place non-empty .txt files inside the data/ directory.
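For reference, loading the corpus might look like the following sketch, which reads every `.txt` file in a directory and skips files that are empty or contain only whitespace. The function name `load_documents` and the UTF-8 encoding are assumptions; the project's actual loader may differ.

```python
from pathlib import Path


def load_documents(data_dir="data"):
    """Map file name -> text for every non-empty .txt file in data_dir."""
    docs = {}
    for path in sorted(Path(data_dir).glob("*.txt")):
        text = path.read_text(encoding="utf-8").strip()
        if text:  # skip empty or whitespace-only files
            docs[path.name] = text
    return docs
```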
python app.py
This project is intended for educational and research purposes.
Users are responsible for ensuring ethical, legal, and compliant usage in real-world deployments.