This document presents the technical architecture for a compact, production-ready Retrieval-Augmented Generation (RAG) system. It details the essential components required to answer user queries that are grounded in a private document corpus. The system architecture is designed for accuracy and scalability, featuring a chunk-based vector DB for retrieval, a robust system prompt to constrain the Groq LLM, and a summarization-based memory manager to maintain conversational coherence.
The purpose of this publication is to provide a technical blueprint for a robust, production-grade Retrieval-Augmented Generation (RAG) system. It outlines the core architecture, component responsibilities, and data flows necessary to build a service that answers questions by retrieving relevant information from a specified document corpus. As an Applied Solution Showcase, this publication focuses on the technical architecture, implementation methodology, and key engineering decisions involved in building such a system.
Large Language Models (LLMs) are powerful but often "hallucinate" or invent information. This makes them unreliable for tasks requiring high factual accuracy based on specific, private knowledge (e.g., enterprise documentation, legal texts, or technical manuals). A RAG system mitigates this risk by augmenting the LLM with a retrieval step. Before answering a query, the system finds relevant passages from a trusted document corpus and passes them to the LLM as context. This grounds the LLM's response in verifiable facts, significantly improving accuracy and allowing the system to cite its sources.
This section details the technical components and their responsibilities. The design emphasizes a separation of concerns, making the system easier to build, maintain, and scale.
The single entry point for all end-user interaction. Receives user requests (handling both single-turn queries and conversational chat). Manages authentication and authorization (e.g., API keys, user tokens). Enforces rate limiting to protect system resources. Logs requests and responses for telemetry and monitoring.
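As a concrete illustration, the following is a minimal sketch of such an endpoint, assuming FastAPI is used; the header name, in-memory key set, rate-limit policy, and the run_rag_pipeline stub are hypothetical placeholders rather than the actual implementation.

```python
# Minimal sketch of the API Layer, assuming FastAPI.
# The API-key set, rate limit, and run_rag_pipeline stub are illustrative only.
from fastapi import FastAPI, Header, HTTPException
from pydantic import BaseModel

app = FastAPI()

VALID_API_KEYS = {"demo-key"}          # in practice, loaded from a secrets store
_request_counts: dict[str, int] = {}   # naive in-process rate limiting
RATE_LIMIT = 60                        # max requests per key (illustrative)


class QueryRequest(BaseModel):
    session_id: str
    query: str


def run_rag_pipeline(session_id: str, query: str) -> str:
    """Placeholder for the retrieval -> assembly -> generation chain."""
    return f"(answer for session {session_id} to: {query})"


@app.post("/query")
def handle_query(req: QueryRequest, x_api_key: str = Header(...)):
    # Authentication: reject unknown API keys
    if x_api_key not in VALID_API_KEYS:
        raise HTTPException(status_code=401, detail="Invalid API key")
    # Rate limiting: count requests per key and reject beyond the limit
    _request_counts[x_api_key] = _request_counts.get(x_api_key, 0) + 1
    if _request_counts[x_api_key] > RATE_LIMIT:
        raise HTTPException(status_code=429, detail="Rate limit exceeded")
    # Hand off to the RAG pipeline and return the result
    answer = run_rag_pipeline(req.session_id, req.query)
    return {"answer": answer}
```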
An asynchronous (offline) process responsible for populating the knowledge base. Accepts raw documents in various formats (PDF, HTML, TXT, etc.). Normalizes and cleans the text content. Splits large documents into smaller, semantically meaningful chunks. Computes vector embeddings for each chunk using a chosen embedding model. Writes the chunk embeddings and their corresponding text/metadata to the Vector DB.
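A minimal sketch of this pipeline is shown below, assuming the sentence-transformers embedding model and ChromaDB described later in this document; the chunk size, overlap, collection name, and metadata fields are illustrative choices.

```python
# Sketch of the ingestion step: chunk text, embed each chunk, and write to ChromaDB.
# Chunk size, collection name, and metadata fields are illustrative choices.
import chromadb
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
client = chromadb.PersistentClient(path="./chroma_db")
collection = client.get_or_create_collection("docs")


def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Fixed-size character chunking with a small overlap between chunks."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks


def ingest_document(doc_id: str, text: str, source: str) -> None:
    chunks = chunk_text(text)
    embeddings = embedder.encode(chunks).tolist()
    collection.add(
        ids=[f"{doc_id}-{i}" for i in range(len(chunks))],
        documents=chunks,
        embeddings=embeddings,
        metadatas=[{"source": source, "chunk": i} for i in range(len(chunks))],
    )
```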
The core knowledge base of the system, storing all document chunks. Stores document chunk embeddings and associated metadata (e.g., source document, page number). Provides efficient k-Nearest Neighbor (k-NN) search to find chunks semantically similar to a query. Supports metadata filtering to narrow search results (e.g., "find answers only in 'document_X.pdf'").
The service that fetches relevant context from the Vector DB. Receives the user's query and generates a query embedding. Executes the semantic search against the Vector DB. Returns the top-K most relevant document chunks, often with their relevance scores.
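The sketch below illustrates this retrieval step against the ChromaDB collection populated by the ingestion sketch above; the optional source filter demonstrates the metadata filtering described for the Vector DB, while the top-K value and distance-based scoring are illustrative.

```python
# Sketch of the Retriever: embed the query and run a k-NN search against the
# ChromaDB collection, optionally restricted by a metadata filter.
import chromadb
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
collection = chromadb.PersistentClient(path="./chroma_db").get_or_create_collection("docs")


def retrieve(query: str, top_k: int = 3, source: str | None = None) -> list[dict]:
    query_embedding = embedder.encode([query]).tolist()
    results = collection.query(
        query_embeddings=query_embedding,
        n_results=top_k,
        where={"source": source} if source else None,  # e.g. only 'document_X.pdf'
    )
    # Pair each returned chunk with its metadata and distance (lower = more similar)
    return [
        {"text": doc, "metadata": meta, "distance": dist}
        for doc, meta, dist in zip(
            results["documents"][0], results["metadatas"][0], results["distances"][0]
        )
    ]
```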
A critical logic component that builds the final prompt for the LLM. Assembles the prompt from multiple sources: a strong system directive that constrains the LLM's behavior (e.g., "You are a helpful assistant. Answer ONLY based on the provided context..."); the document chunks from the Retriever; the conversation summary and recent messages from the Memory Manager; and the user's latest question. Manages the token budget by truncating or prioritizing context to fit within the LLM's context window.
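A simplified sketch of this assembly and budgeting logic follows; the system prompt wording, the 4-characters-per-token estimate, and the 6,000-token budget are illustrative assumptions rather than the system's actual values.

```python
# Sketch of the Context Assembler: combine system prompt, conversation memory,
# retrieved chunks, and the user question, dropping lowest-ranked chunks when
# the rough token budget is exceeded. The 4-chars-per-token estimate is a heuristic.
SYSTEM_PROMPT = (
    "You are a helpful assistant. Answer ONLY based on the provided context. "
    "If the context does not contain the answer, say you do not know."
)


def estimate_tokens(text: str) -> int:
    return len(text) // 4  # rough heuristic; a real tokenizer would be more precise


def assemble_prompt(question: str, chunks: list[str], summary: str,
                    recent_messages: list[str], max_tokens: int = 6000) -> str:
    fixed = [
        SYSTEM_PROMPT,
        "Conversation summary:\n" + summary,
        "Recent messages:\n" + "\n".join(recent_messages),
        "Question:\n" + question,
    ]
    budget = max_tokens - sum(estimate_tokens(part) for part in fixed)
    # Keep chunks in relevance order until the remaining budget is exhausted
    kept = []
    for chunk in chunks:
        cost = estimate_tokens(chunk)
        if cost > budget:
            break
        kept.append(chunk)
        budget -= cost
    context = "Context:\n" + "\n---\n".join(kept)
    return "\n\n".join([fixed[0], fixed[1], fixed[2], context, fixed[3]])
```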
A dedicated client for interfacing with the high-speed Groq API. Sends the fully assembled prompt to the Groq LLM API. Receives the generated text response. Handles API-specific logic, such as connection retries and timeouts.
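The sketch below illustrates such a connector using the groq Python package; the model name, environment variable, and exponential-backoff retry policy are illustrative assumptions.

```python
# Sketch of the Groq LLM Connector: send the assembled prompt and retry on
# transient failures. Model name and retry policy are illustrative.
import os
import time

from groq import Groq

client = Groq(api_key=os.environ["GROQ_API_KEY"])


def generate(prompt: str, model: str = "llama-3.1-8b-instant", max_attempts: int = 3) -> str:
    for attempt in range(1, max_attempts + 1):
        try:
            response = client.chat.completions.create(
                model=model,
                messages=[{"role": "user", "content": prompt}],
            )
            return response.choices[0].message.content
        except Exception:
            if attempt == max_attempts:
                raise
            time.sleep(2 ** attempt)  # simple exponential backoff between retries
```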
A stateful service (e.g., using Redis) that maintains conversational context. Stores recent raw interactions (user message, LLM answer) for each session. Asynchronously (or on-demand) uses the LLM to create a running summary of the conversation. Provides both the recent messages and the compressed summary to the Context Assembler, enabling long-term coherence without exceeding the token limit.
A final step to clean and format the LLM's raw output. Parses the LLM response to add citations (linking the answer to the source document chunks). Enforces safety rules or content moderation.
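A minimal citation-formatting sketch is shown below; it assumes the chunk dictionaries produced by the retriever sketch above, and the citation format itself is an illustrative choice.

```python
# Sketch of the Post-Processor: append source citations for the chunks that
# were supplied as context. Citation format is an illustrative choice.
def add_citations(answer: str, chunks: list[dict]) -> str:
    sources = []
    for chunk in chunks:
        meta = chunk.get("metadata", {})
        label = f"{meta.get('source', 'unknown')} (chunk {meta.get('chunk', '?')})"
        if label not in sources:
            sources.append(label)
    if not sources:
        return answer
    return answer + "\n\nSources:\n" + "\n".join(f"- {s}" for s in sources)
```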
This section describes how the components interact during primary operations.

A user uploads a document (e.g., manual.pdf) via the API Layer. The Ingestion Pipeline picks up the document. The pipeline splits the document into 50 chunks. It generates an embedding for each chunk. Each chunk (text, metadata, and embedding) is written to the Vector DB.

A user sends a query: "How do I reset the frobnicator?" via the API Layer. The Retriever generates an embedding for this query and fetches the top 3 relevant chunks from the Vector DB. The Memory & Summarization Manager is queried; it provides the most recent raw messages and the current conversation summary (e.g., "User asked what a frobnicator is..."). The Context Assembler builds the final prompt, including the system prompt, summary, recent messages, retrieved chunks, and the new query. The Groq LLM Connector sends this prompt to the Groq API. The Post-Processor validates the LLM's response and adds citation metadata, and the API Layer returns the final, formatted answer to the user. The Memory & Summarization Manager saves this new interaction and updates its summary.

The Context Assembler builds the final prompt, including the system prompt and the new query. The Groq LLM Connector sends this complete prompt to the Groq API. The LLM responds. The Post-Processor validates the response and adds citation metadata. The API Layer returns the final, formatted answer to the user. The Memory & Summarization Manager saves this new interaction and updates its summary.

The LLM is selected based on the available API key. A sketch of this selection logic is given below.
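The following is a minimal sketch of this selection logic, not the original implementation; the fallback provider, environment variable names, and model identifiers are illustrative assumptions, with Groq as the primary provider described in this document.

```python
# Sketch of selecting the LLM based on which API key is available in the
# environment. The fallback provider and model IDs are illustrative assumptions.
import os


def select_llm() -> tuple[str, str]:
    """Return a (provider, model) pair based on the available API key."""
    if os.getenv("GROQ_API_KEY"):
        return "groq", "llama-3.1-8b-instant"
    if os.getenv("OPENAI_API_KEY"):          # hypothetical fallback provider
        return "openai", "gpt-4o-mini"
    raise RuntimeError("No supported LLM API key found in the environment")
```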

To maintain conversational context over a long interaction, this system employs a memory summarization technique. Instead of just storing the last few messages, the Memory & Summarization Manager actively condenses the history. After an interaction, the user's query and the LLM's full response are processed. The system uses an LLM to create or update a running summary of the conversation. This compressed summary, which captures the key entities and topics discussed, is then saved. In subsequent turns, this summary is passed back to the Context Assembler along with the most recent raw messages, allowing the RAG system to remember the entire conversation's history without exceeding the LLM's token limit.
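The following is a minimal in-memory sketch of this behaviour; it takes a summarization callable (for example, the Groq connector's generate function sketched above) rather than a concrete LLM client, the summarization prompt wording and window size are illustrative, and a production deployment would back the store with Redis as noted earlier.

```python
# Sketch of the Memory & Summarization Manager: keep the last few raw turns and
# fold older turns into an LLM-generated running summary. In-memory only; the
# prompt wording and window size are illustrative.
from typing import Callable


class MemoryManager:
    def __init__(self, summarize_fn: Callable[[str], str], max_recent: int = 4):
        self.summarize_fn = summarize_fn  # e.g. the Groq connector's generate()
        self.max_recent = max_recent
        self.recent: list[str] = []       # most recent raw turns
        self.summary: str = ""            # compressed history of older turns

    def add_turn(self, user_msg: str, assistant_msg: str) -> None:
        self.recent.append(f"User: {user_msg}\nAssistant: {assistant_msg}")
        if len(self.recent) > self.max_recent:
            oldest = self.recent.pop(0)
            # Fold the evicted turn into the running summary via the LLM
            self.summary = self.summarize_fn(
                "Update this conversation summary with the new exchange, "
                "keeping key entities and topics.\n\n"
                f"Current summary:\n{self.summary or '(empty)'}\n\n"
                f"New exchange:\n{oldest}"
            )

    def get_context(self) -> tuple[str, list[str]]:
        """Return (summary, recent raw messages) for the Context Assembler."""
        return self.summary, list(self.recent)
```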

Regular updates are performed every 2–3 weeks to ensure the system incorporates the latest improvements, bug fixes, and compatibility updates with LLMs and vector store libraries.
Users can access support through the project's GitHub repository or via dedicated email support at malikshahzaibkharal1199@gmail.com.
Planned releases include incremental improvements to the RAG pipeline, optimization of embeddings, integration with new LLMs, and performance enhancements, alongside ongoing bug fixes and optimizations.
The embedding model used is sentence-transformers/all-MiniLM-L6-v2, chosen for its ability to generate high-quality semantic embeddings while maintaining low computational cost and fast inference times, making it ideal for real-time applications. Its compact size ensures efficient storage and enables rapid vector similarity searches within the RAG pipeline. This model is particularly effective for semantic similarity searches across document collections, as its dense embeddings capture contextual relationships between sentences, improving the accuracy of retrieval for the RAG assistant and producing more relevant and precise responses.
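The snippet below illustrates how this model produces embeddings and how cosine similarity ranks chunks against a query; the example texts reuse the frobnicator query from the flow above and are purely illustrative.

```python
# Sketch of semantic similarity scoring with all-MiniLM-L6-v2.
# The example chunks and query are illustrative only.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

chunks = [
    "The frobnicator is reset by holding the power button for ten seconds.",
    "Warranty claims must be submitted within 30 days of purchase.",
]
query = "How do I reset the frobnicator?"

chunk_embeddings = model.encode(chunks, normalize_embeddings=True)
query_embedding = model.encode(query, normalize_embeddings=True)

# Higher cosine similarity means a more relevant chunk
scores = util.cos_sim(query_embedding, chunk_embeddings)[0]
for chunk, score in zip(chunks, scores):
    print(f"{score:.3f}  {chunk}")
```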
The vector store used is ChromaDB, selected for its simplicity, support for local storage, and cost-effectiveness. ChromaDB enables persistent vector storage directly on the local filesystem, allowing easy deployment without relying on cloud infrastructure. Furthermore, it provides fast indexing and querying capabilities, ensuring low-latency retrieval, which is essential for real-time conversational systems like the RAG assistant.
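Because ChromaDB persists vectors directly to the local filesystem, the store created during ingestion can simply be reopened in a later process, as in this small sketch (the path and collection name are illustrative):

```python
# Reopening the locally persisted ChromaDB store; no external service needed.
import chromadb

client = chromadb.PersistentClient(path="./chroma_db")   # same path used at ingestion
collection = client.get_or_create_collection("docs")
print(f"Collection '{collection.name}' holds {collection.count()} chunks")
```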

The LLM used is the Groq LLM, chosen for its high throughput, low latency, and cost efficiency compared to other large language models. It is optimized for rapid inference, ensuring faster response times in real-time applications and making it ideal for interactive systems where user experience depends on quick responses. Additionally, the Groq LLM offers competitive pricing while delivering high-quality generative capabilities, and it can efficiently handle multiple concurrent requests without performance degradation, providing excellent scalability for production environments.

A simple "fixed-size" chunking method may split text in awkward places (e.g., mid-sentence), reducing the quality of retrieved context. The "Top-K" retrieval is not perfect. It may miss relevant context (low recall) or include irrelevant context (low precision), which can confuse the LLM. The conversation summary may be slightly out-of-date if it is generated asynchronously, which could briefly impact coherence in a rapid-fire conversation.
Implement strategies like sentence-aware or agentic chunking to create more semantically coherent context blocks. Add a "re-ranker" model after the Retriever to re-score the top-K chunks for more nuanced relevance, improving precision. Implement logic to break down complex user queries (e.g., "compare X and Y") into multiple, simpler sub-queries for the retrieval system.