A Retrieval-Augmented Generation (RAG) assistant that answers user queries using a collection of local documents. It retrieves relevant text chunks from your dataset and generates answers using LLMs.
This RAG Assistant is designed for:
- .txt documents using embeddings
- chat-completion LLMs (e.g., gpt-3.5-turbo, gpt-4o-mini)

It is not yet optimized for other document formats.

The workflow:
```
User Query
    │
    ▼
Retrieve relevant document chunks
    │
    ▼
Combine context
    │
    ▼
Send to LLM (OpenAI / Groq / Google Gemini)
    │
    ▼
Return generated answer
```
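The steps above can be sketched in Python. The helper names (`retrieve_chunks`, `answer_query`) and the `llm_client.complete` call are illustrative, not the project's actual API; the retrieval call assumes a Chroma-style collection:

```python
def retrieve_chunks(collection, query, n_results=4):
    """Fetch the most relevant chunks (Chroma-style query interface)."""
    result = collection.query(query_texts=[query], n_results=n_results)
    return result["documents"][0]

def answer_query(collection, llm_client, query):
    # 1. Retrieve relevant document chunks
    chunks = retrieve_chunks(collection, query)
    # 2. Combine them into a single context block
    context = "\n\n".join(chunks)
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )
    # 3. Send to the LLM and return the generated answer
    return llm_client.complete(prompt)
```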
The assistant loads .txt files from a content directory and retrieves the top n_results relevant chunks per query.

Chunking breaks large text files into smaller pieces that can be embedded and retrieved efficiently: too large and context becomes diluted; too small and meaning may be lost.
In this project, we use fixed-size chunking with overlap, where each document is split into chunks of ~300–400 tokens with a 20–30% overlap between chunks. This preserves context across boundaries and ensures that important sentences aren't split mid-thought.
The sliding overlap keeps adjacent chunks partially redundant, so a sentence near a chunk boundary stays retrievable from either side.
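A minimal sketch of this chunker, using whitespace-separated words as a rough stand-in for tokens (an assumption for illustration; the defaults mirror the ~350-token size and ~25% overlap figures above):

```python
def chunk_text(text, chunk_size=350, overlap=90):
    """Split text into fixed-size chunks with a sliding overlap.
    Words approximate tokens here; a real tokenizer would be more exact."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap  # how far the window slides each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # last window already reached the end of the text
    return chunks
```

Each chunk repeats the last `overlap` words of the previous one, which is what keeps boundary sentences intact in at least one chunk.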
| Backend | Free Tier Options | Notes |
|---|---|---|
| OpenAI GPT | Free trial credits | 429 errors if quota exceeded |
| Groq LLaMA 3.1 | Free trial / local testing | Supports instant and batch inference |
| Google Gemini | Free-tier API access | May require account setup and quota limits |
Recommended: manage quotas carefully or use local Groq models for higher throughput.
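One way to soften the 429 / quota issues noted in the table is exponential backoff around the LLM call. This is a sketch rather than drop-in code: `QuotaError` stands in for the backend-specific rate-limit exception (e.g. OpenAI's rate-limit error), and `send` is any callable that performs the request:

```python
import time

class QuotaError(Exception):
    """Stand-in for a backend's rate-limit error (e.g. HTTP 429)."""

def call_with_backoff(send, prompt, retries=3, base_delay=2.0):
    """Call send(prompt), waiting twice as long after each quota error."""
    for attempt in range(retries):
        try:
            return send(prompt)
        except QuotaError:
            if attempt == retries - 1:
                raise  # out of retries; surface the error to the caller
            time.sleep(base_delay * (2 ** attempt))
```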
```python
from rag_assistant import RAGAssistant

rag = RAGAssistant()
while True:
    query = input("Enter a question or 'quit' to exit: ")
    if query.lower() == "quit":
        break
    result = rag.query(query)
    print("Answer:", result["answer"])
```
```
Enter a question or 'quit' to exit: What is an asteroid?
Answer: An asteroid is a small rocky body orbiting the Sun, primarily found in the asteroid belt between Mars and Jupiter.
```
For trivial queries (e.g., "test"), consider skipping retrieval or returning a generic response. Tune n_results to limit irrelevant chunks.

Dependencies:
- openai (for the OpenAI API)
- chromadb or another vector database backend (for Groq / embeddings)
- sentence-transformers (for embeddings)
- python-dotenv (for .env loading)
- langchain (optional, for prompt chaining)

Install dependencies:
```bash
pip install openai chromadb sentence-transformers python-dotenv langchain
```
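Since python-dotenv is listed for .env loading, a typical .env might look like the following. The key names and values are placeholders, not confirmed project settings; set only the backends you use, and keep the file out of version control:

```
OPENAI_API_KEY=your-openai-key
GROQ_API_KEY=your-groq-key
GOOGLE_API_KEY=your-gemini-key
```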
MIT License