
Production-Ready RAG System with Enterprise Security
A production-grade Retrieval-Augmented Generation (RAG) system powered by Google Gemini 2.5 Flash, featuring enterprise-level security, observability, and monitoring capabilities.
This project demonstrates a production-ready RAG system built with modern AI technologies and enterprise best practices. It combines intelligent question answering with comprehensive security, monitoring, and reliability features suitable for real-world deployment.
Retrieval-Augmented Generation (RAG) enhances Large Language Models by retrieving relevant information from a knowledge base before generating responses, resulting in more accurate and contextually grounded answers.
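The retrieve-then-generate flow described above can be sketched in a few lines. This is a toy illustration only: it ranks documents by bag-of-words cosine similarity rather than the FAISS embeddings the project uses, and the names `retrieve`, `build_prompt`, and `DOCS` are hypothetical, not part of this repository. The resulting prompt would then be sent to the LLM.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        documents,
        key=lambda d: cosine(q, Counter(tokenize(d))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query: str, passages: list[str]) -> str:
    """Place the retrieved passages ahead of the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


# Hypothetical stand-in for the knowledge base.
DOCS = [
    "FAISS is a library for efficient similarity search over dense vectors.",
    "Streamlit turns Python scripts into shareable web apps.",
    "Wikipedia is a free online encyclopedia.",
]

question = "What is FAISS used for?"
prompt = build_prompt(question, retrieve(question, DOCS, k=1))
```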
| Feature | Description |
|---|---|
| Guardrails AI | Real-time validation to prevent harmful content, PII leakage, and prompt injections |
| Rate Limiting | Token bucket algorithm to prevent API abuse and control costs |
| Audit Logging | Comprehensive structured logging for compliance and security monitoring |
| Input Validation | Multi-layer checks before queries reach the LLM |
| Output Validation | Automated screening of LLM responses before displaying to users |
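To make the rate-limiting row concrete, here is a minimal token bucket sketch. It is not the code in `src/security/rate_limiter.py`; the class name, parameters, and injectable clock are assumptions chosen for illustration.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilled at `refill_rate` tokens per second."""

    def __init__(self, capacity: float, refill_rate: float, clock=time.monotonic):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self._clock = clock  # injectable so the bucket can be tested without sleeping
        self._tokens = capacity
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Spend `cost` tokens if available; return False when the caller should back off."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill in proportion to elapsed time, never exceeding capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_rate)
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

A bucket created with `TokenBucket(capacity=10, refill_rate=1.0)` would permit a burst of ten queries and then roughly one query per second. Passing a larger `cost` for expensive requests is one way to tie the limit to API spend rather than request count.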
    ┌─────────────────────────────────────────────────────────┐
    │               User Interface (Streamlit)                │
    └────────────────────────────┬────────────────────────────┘
                                 │
                  ┌──────────────┴──────────────┐
                  │                             │
          ┌───────▼────────┐           ┌────────▼────────┐
          │ Security Layer │           │  Observability  │
          │ - Guardrails   │           │ - Monitoring    │
          │ - Rate Limit   │           │ - Tracing       │
          │ - Audit Logs   │           │ - Metrics       │
          └───────┬────────┘           └────────┬────────┘
                  │                             │
                  └──────────────┬──────────────┘
                                 │
                        ┌────────▼─────────┐
                        │   RAG Pipeline   │
                        │ Gemini 2.5 Flash │
                        └────────┬─────────┘
                                 │
                  ┌──────────────┴──────────────┐
                  │                             │
          ┌───────▼────────┐           ┌────────▼────────┐
          │ Vector Search  │           │ Knowledge Base  │
          │    (FAISS)     │           │   (Wikipedia)   │
          └────────────────┘           └─────────────────┘
    git clone <repository-url>
    cd sentinelrag
    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
    pip install -r requirements.txt
Create a .env file in the project root:
    # Required: Google Gemini API
    GOOGLE_API_KEY=your_google_api_key_here
    # Optional: Langfuse for LLM tracing (advanced)
    LANGFUSE_PUBLIC_KEY=pk_...
    LANGFUSE_SECRET_KEY=sk_...
    LANGFUSE_HOST=https://cloud.langfuse.com
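As an illustration of how an application might consume these variables, the sketch below validates the required key and treats the Langfuse keys as optional. It assumes the variables are already in the process environment (for example, exported in the shell or loaded from `.env` by a loader of your choice); `Settings` and `load_settings` are hypothetical names, not this project's API.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    google_api_key: str
    langfuse_public_key: Optional[str] = None
    langfuse_secret_key: Optional[str] = None
    langfuse_host: Optional[str] = None

    @property
    def tracing_enabled(self) -> bool:
        # Tracing needs both Langfuse keys; the host alone is not enough.
        return bool(self.langfuse_public_key and self.langfuse_secret_key)


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read configuration from a mapping, failing fast if the required key is missing."""
    api_key = env.get("GOOGLE_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("GOOGLE_API_KEY is not set; add it to .env or export it")
    return Settings(
        google_api_key=api_key,
        langfuse_public_key=env.get("LANGFUSE_PUBLIC_KEY") or None,
        langfuse_secret_key=env.get("LANGFUSE_SECRET_KEY") or None,
        langfuse_host=env.get("LANGFUSE_HOST") or None,
    )
```

Accepting any mapping rather than reading `os.environ` directly keeps the function easy to test with a plain dictionary.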
streamlit run app_gemini.py --server.headless=true --server.port=8501
Access at: http://localhost:8501
streamlit run app_gemini_secure.py --server.headless=true --server.port=8502
Access at: http://localhost:8502
Example queries:
The system uses Gemini 2.5 Flash by default, offering:
In the Secure Version (app_gemini_secure.py), you can configure:
| Setting | Description | Default |
|---|---|---|
| Enable Guardrails AI | Input/output validation | Enabled |
| Enable Rate Limiting | Prevent API abuse | Enabled |
| Enable Audit Logging | Security event tracking | Enabled |
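For a sense of what structured audit logging can look like, the sketch below emits one JSON object per security event. The field names and the choice to store a hash of the query instead of its text are assumptions made for this example, not a description of `src/security/audit_logger.py`.

```python
import hashlib
import json
import logging
from datetime import datetime, timezone

audit_log = logging.getLogger("audit")


def audit_record(event: str, user_id: str, query: str, allowed: bool) -> str:
    """Build one JSON line describing a security-relevant event."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "event": event,
        "user_id": user_id,
        # Hash the query so the log can correlate events without storing raw text.
        "query_sha256": hashlib.sha256(query.encode("utf-8")).hexdigest(),
        "query_length": len(query),
        "allowed": allowed,
    }
    return json.dumps(record, sort_keys=True)


def log_event(event: str, user_id: str, query: str, allowed: bool) -> None:
    """Write the record through the standard logging machinery."""
    audit_log.info(audit_record(event, user_id, query, allowed))
```

One JSON object per line keeps the log easy to ship to a log aggregator and to filter by field.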
When enabled, Guardrails AI automatically protects against:
| Component | Typical Time |
|---|---|
| Initial Load (first time) | 30-60 seconds |
| Query Response | 2-5 seconds |
| Observability Overhead | ~100ms (7-13%) |
Google Gemini 2.5 Flash Pricing:
Estimated Costs:
Track real-time system performance:
    # Start metrics collection
    python -m src.monitoring.health_endpoint
    # View metrics
    curl http://localhost:8080/metrics
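The same endpoint can be read from Python. The sketch below assumes the endpoint serves the Prometheus text exposition format, which the `config/prometheus/` directory suggests but the source does not confirm. The metric names in `SAMPLE` are invented for illustration, and the parser handles only simple `name value` lines without trailing timestamps.

```python
from urllib.request import urlopen


def parse_metrics(text: str) -> dict[str, float]:
    """Parse simple `name value` lines, skipping comments and blank lines."""
    metrics: dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the last space so label values containing spaces stay in the name.
        name, _, value = line.rpartition(" ")
        if not name:
            continue
        try:
            metrics[name] = float(value)
        except ValueError:
            continue  # Ignore lines this simple parser does not understand.
    return metrics


def fetch_metrics(url: str = "http://localhost:8080/metrics", timeout: float = 5.0) -> dict[str, float]:
    """Fetch and parse metrics from a running health endpoint."""
    with urlopen(url, timeout=timeout) as resp:
        return parse_metrics(resp.read().decode("utf-8"))


# Hypothetical sample payload; real metric names may differ.
SAMPLE = (
    "# HELP rag_queries_total Total queries\n"
    "# TYPE rag_queries_total counter\n"
    "rag_queries_total 42\n"
    'rag_latency_seconds{stage="llm"} 1.5\n'
)
parsed_sample = parse_metrics(SAMPLE)
```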
Visualize system health and performance:
    # Install Grafana (macOS)
    brew install grafana
    brew services start grafana
    # Access dashboard
    open http://localhost:3000
    # Import: config/grafana/dashboards/rag_overview.json
Dashboard includes:
Verify the installation:
    # Activate virtual environment
    source venv/bin/activate
    # Run test suite
    python test_week2_modules.py
Expected output:
======================================================================
Test Summary
======================================================================
Passed: 18
Failed: 0
Skipped: 0
Total: 18
All tests passed!
Success Rate: 100.0%
Solution:
    # Ensure virtual environment is activated
    source venv/bin/activate  # On Windows: venv\Scripts\activate
    # Reinstall dependencies
    pip install -r requirements.txt
Solution:
.env file
Solution:
    # Check if port is already in use
    lsof -i :8501  # or :8502 for secure version
    # Kill existing process if needed
    pkill -f streamlit
    # Restart application
    streamlit run app_gemini.py
This is expected. Langfuse is an optional feature for advanced LLM tracing, and the system works without it. To enable it, set:
    export LANGFUSE_PUBLIC_KEY="pk_..."
    export LANGFUSE_SECRET_KEY="sk_..."
    export LANGFUSE_HOST="https://cloud.langfuse.com"
    sentinelrag/
    ├── app_gemini.py                        # Basic Streamlit application
    ├── app_gemini_secure.py                 # Secure version with Guardrails
    ├── src/
    │   ├── rag_pipeline_gemini.py           # Core RAG implementation
    │   ├── rag_pipeline_with_guardrails.py  # Secure RAG pipeline
    │   ├── vector_store.py                  # FAISS vector database
    │   ├── data_collector.py                # Knowledge base builder
    │   ├── security/                        # Security features
    │   │   ├── guardrails_integration.py
    │   │   ├── rate_limiter.py
    │   │   └── audit_logger.py
    │   ├── observability/                   # Monitoring & metrics
    │   │   ├── hallucination_detector.py
    │   │   ├── cost_calculator.py
    │   │   ├── latency_tracker.py
    │   │   └── metrics_registry.py
    │   ├── resilience/                      # Reliability features
    │   │   ├── retry_policy.py
    │   │   └── timeout_manager.py
    │   └── monitoring/                      # Health checks
    │       └── health_endpoint.py
    ├── test_week2_modules.py                # Test suite
    ├── config/                              # Configuration files
    │   ├── prometheus/
    │   └── grafana/
    ├── requirements.txt                     # Dependencies
    └── README.md                            # This file
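The `resilience/` package lists a retry policy. As a rough idea of what such a policy usually involves, here is a sketch of retries with exponential backoff and full jitter. It is an assumption about the general technique, not the contents of `retry_policy.py`, and the function name and parameters are hypothetical.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry(
    func: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `func`, retrying on the given exceptions with capped exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise  # Out of attempts: surface the last error to the caller.
            # The delay ceiling doubles each attempt, capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: sleep a random amount up to the ceiling.
            sleep(random.uniform(0, ceiling))
    raise AssertionError("unreachable")
```

Jitter spreads retries from concurrent clients so they do not hit the API at the same instant. Restricting `retry_on` to transient errors (timeouts, connection resets) avoids retrying requests that can never succeed.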
Guardrails AI provides enterprise-grade safety for all LLM interactions:
Input Protection:
Output Protection:
Monitoring & Compliance:
| Provider | Model | Cost per 1K queries | Monthly (30K queries) |
|---|---|---|---|
| Google Gemini | 2.5 Flash | $0.33 | $10 |
| OpenAI | GPT-3.5 Turbo | $0.75 | $22.50 |
| OpenAI | GPT-4 Turbo | $3.50 | $105 |
| Anthropic | Claude Instant | $0.80 | $24 |
Of the models compared above, Gemini 2.5 Flash has the lowest estimated cost per query.
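The monthly figures follow directly from the per-1K rates: monthly cost = (cost per 1K queries) × (queries per month) / 1000. The short script below reproduces the comparison using the rates from the table; the function name is illustrative.

```python
# Rates copied from the comparison table (USD per 1,000 queries).
COST_PER_1K_QUERIES = {
    "Gemini 2.5 Flash": 0.33,
    "GPT-3.5 Turbo": 0.75,
    "GPT-4 Turbo": 3.50,
    "Claude Instant": 0.80,
}


def monthly_cost(cost_per_1k: float, queries_per_month: int) -> float:
    """Scale a per-1K-query rate to a monthly query volume."""
    return cost_per_1k * queries_per_month / 1000


for model, rate in COST_PER_1K_QUERIES.items():
    print(f"{model}: ${monthly_cost(rate, 30_000):.2f} per month")
```

At 30,000 queries, Gemini 2.5 Flash comes to about $9.90, which the table rounds to $10.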
Major Updates:
Bug Fixes:
Additional documentation available:
Contributions are welcome! This project is designed for learning and improvement.
Create a feature branch (git checkout -b feature/amazing-feature), commit your changes (git commit -m 'Add amazing feature'), and push the branch (git push origin feature/amazing-feature).
This project is licensed under the MIT License; see the LICENSE file for details.
You are free to:
We're open to suggestions! Please open an issue to propose new features.
If you find this project helpful, please consider giving it a star!