RAG-v1 is a modular, production-ready Retrieval-Augmented Generation (RAG) system for document-based question answering. It integrates ChromaDB for vector search, LangChain for orchestration, and Groq for large language model (LLM) inference. The system supports both a modern Streamlit web interface and a comprehensive command-line interface (CLI), enabling flexible, scalable, and secure document ingestion, retrieval, and conversational AI workflows.
1. Introduction and Motivation
The exponential growth of unstructured data has created a need for systems that can efficiently retrieve and synthesize information from large document collections. RAG-v1 addresses this by combining state-of-the-art vector search with LLMs, providing both technical and non-technical users with powerful tools for document-based Q&A, analytics, and knowledge management.
2. System Architecture
2.1 High-Level Overview
Figure 1: High-level system architecture showing all major components and their relationships
The RAG-v1 system is organized into five main layers (a short call-path sketch follows the list):
User Interface Layer: Dual interface (Streamlit web UI and CLI) for all operations.
Application Layer: Core processing engine with configuration and logging management.
Data Processing Layer: Document loading, text processing, and vector database operations.
External Services: LLM and embedding model integrations.
Data Storage: File system organization for raw documents, processed data, vectors, and logs.
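To make the layering concrete, the sketch below traces one request through the stack, using the module paths listed in Section 3. The InitManager and RAGPipeline class names and their methods are assumptions for illustration, not the project's actual API:

```python
# Illustrative call path through the layers. Module paths come from Section 3;
# the InitManager/RAGPipeline class names and their methods are assumptions.
from src.utils.init_manager import InitManager   # Application layer: logging + .env
from src.rag_pipeline import RAGPipeline         # Data Processing layer

InitManager().setup()                  # configuration and logging management
pipeline = RAGPipeline()               # wraps loading, chunking, ChromaDB, and Groq
pipeline.ingest("data/raw/")           # Data Storage: raw documents -> vectors
print(pipeline.query("What is in the corpus?"))  # External Services: LLM call
```

The UI layer (main.py or app.py) is simply the caller of this sequence, which keeps the web and CLI front ends thin and interchangeable.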
2.2 Component Diagram
Figure 2: System Architecture showing the main components and their interactions
3. Key Modules and Their Roles
main.py: CLI entry point; orchestrates all backend operations.
app.py: Streamlit web interface; mirrors CLI functionality with a modern UI.
src/rag_pipeline.py: Core pipeline; manages ingestion, retrieval, and LLM calls.
src/ingestion/document_loader.py: Loads and parses PDF, DOCX, TXT, and JSON files.
src/utils/config_loader.py: Loads YAML config with dot-notation access (sketched after this list).
src/utils/init_manager.py: Initializes logging, loads .env, and sets up the runtime environment.
src/utils/log_manager.py: Handles timestamped log file creation and management.
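To show what dot-notation config access means in practice, here is a minimal sketch of how config_loader.py might work. The ConfigLoader name and get() signature are assumptions; only the behavior described above (YAML loading plus dotted-key lookup) comes from the source:

```python
# Minimal sketch of dot-notation access over a YAML config (names assumed).
import yaml

class ConfigLoader:
    def __init__(self, path: str = "config/config.yaml"):
        with open(path, "r", encoding="utf-8") as f:
            self._cfg = yaml.safe_load(f)

    def get(self, dotted_key: str, default=None):
        """Resolve keys like 'llm.model' by walking nested dicts."""
        node = self._cfg
        for part in dotted_key.split("."):
            if not isinstance(node, dict) or part not in node:
                return default
            node = node[part]
        return node

# Usage: model = ConfigLoader().get("llm.model", "llama-3.1-8b-instant")
```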
4. Data Flow and Processing Pipeline
Figure 3: Data flow sequence showing the end-to-end process from document ingestion to query response
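In code, the end-to-end flow reduces to two phases: ingest (load, chunk, embed, persist) and query (retrieve, then generate). The sketch below uses LangChain's community integrations with Chroma and Groq; the file paths, chunk sizes, and model names are illustrative assumptions, not the project's actual defaults:

```python
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_groq import ChatGroq

# Ingest: load a PDF, split into overlapping chunks, embed, and persist.
docs = PyPDFLoader("data/raw/sample.pdf").load()
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_documents(docs)
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
vectordb = Chroma.from_documents(chunks, embeddings, persist_directory="data/vectors")

# Query: retrieve the top-k chunks, then ask the LLM to answer from them.
question = "What does the document conclude?"
hits = vectordb.similarity_search(question, k=4)
context = "\n\n".join(doc.page_content for doc in hits)
llm = ChatGroq(model="llama-3.1-8b-instant")  # reads GROQ_API_KEY from the environment
answer = llm.invoke(f"Answer using only this context:\n{context}\n\nQuestion: {question}")
print(answer.content)
```

Persisting the Chroma index means ingestion runs once per corpus, while queries reuse the stored vectors.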
5. Security and Configuration
Secrets Management: All API keys (e.g., GROQ_API_KEY) are loaded from a .env file (see .env.example); a loading sketch follows this list.
Configuration: System behavior is controlled via config/config.yaml (logging, LLM, vector DB, etc.).
Best Practices: .env is git-ignored; .env.example is provided for onboarding.
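A minimal sketch of the secrets flow, assuming python-dotenv (which the .env/.env.example convention implies); the error message is illustrative:

```python
# Load secrets from .env into the process environment and fail fast if missing.
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env in the working directory into os.environ
groq_key = os.environ.get("GROQ_API_KEY")
if not groq_key:
    raise RuntimeError("GROQ_API_KEY not set; copy .env.example to .env and fill it in")
```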
6. Extensibility and Customization
Figure 4: Extensibility diagram showing pluggable components and configuration-driven architecture
Pluggable Embeddings: Swap HuggingFace models via config.
LLM Agnostic: Easily switch LLM providers by updating config and .env.
Custom Ingestion: Extend document_loader.py for new file types (example after this list).
UI/UX: Add new Streamlit pages or CLI commands as needed.
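As an example of the ingestion extension point, here is a hypothetical loader for a new file type (Markdown). The load_markdown function and LOADERS registry are illustrative; only the idea of extending document_loader.py per file type comes from the source:

```python
# Hypothetical extension of document_loader.py: register a loader for .md files.
from pathlib import Path
from langchain_core.documents import Document

def load_markdown(path: str) -> list[Document]:
    """Parse a Markdown file into a single Document with source metadata."""
    text = Path(path).read_text(encoding="utf-8")
    return [Document(page_content=text, metadata={"source": path})]

# A registry keyed by extension keeps ingestion pluggable:
LOADERS = {".md": load_markdown}
```

Keeping loaders behind an extension-keyed registry means new formats only require a new entry, not changes to the pipeline itself.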
7. CLI and Web UI Design
CLI: Supports all operations (init, ingest, query, stats, clear, logs, etc.) with rich help and examples; a command-surface sketch follows below.
Web UI: Streamlit app with dashboard, chat, ingestion, stats, and log management.
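A minimal argparse sketch of that command surface is shown below. The subcommand names come from the list above, while the flags, help strings, and dispatch are assumptions about how main.py might be structured:

```python
# Sketch of the CLI surface (hypothetical wiring; main.py may differ).
import argparse

def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="rag-v1", description="Document Q&A over a local vector store")
    sub = parser.add_subparsers(dest="command", required=True)
    sub.add_parser("init", help="Initialize logging, .env, and directories")
    ingest = sub.add_parser("ingest", help="Load and index documents")
    ingest.add_argument("path", help="File or directory to ingest")
    query = sub.add_parser("query", help="Ask a question against the index")
    query.add_argument("question")
    sub.add_parser("stats", help="Show vector store statistics")
    sub.add_parser("clear", help="Delete the vector store")
    sub.add_parser("logs", help="List recent log files")
    return parser

if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"Dispatching command: {args.command}")  # real dispatch calls the pipeline
```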
About the Author
Baljit Oberoi is a Technical Project/Program Management Consultant with extensive experience delivering solutions in data, analytics, and AI. He is an AI enthusiast with a strong interest in exploring how deep learning can be applied to predict financial market movements.