Overview
This project demonstrates how to build a Retrieval-Augmented Generation (RAG) assistant—an AI system that can answer questions about your own documents using state-of-the-art language models and vector search. The assistant leverages ChromaDB for vector storage, Sentence Transformers for embeddings, and supports multiple LLM providers (OpenAI, Groq, Google Gemini).
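At query time the flow is: embed the question, retrieve the most similar chunks from ChromaDB, and hand the retrieved text to the LLM as context. The sketch below illustrates that flow with Groq as the provider; the answer() function, the chroma_db path, and the prompt wording are illustrative assumptions rather than the project's actual code.

```python
# Illustrative retrieve-then-generate flow (assumed names; not the project's actual code).
import os

import chromadb
from groq import Groq
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer(os.getenv("EMBEDDING_MODEL", "sentence-transformers/all-MiniLM-L6-v2"))
chroma = chromadb.PersistentClient(path="chroma_db")  # assumed persistence path
collection = chroma.get_or_create_collection(os.getenv("CHROMA_COLLECTION_NAME", "rag_documents"))

def answer(question: str) -> str:
    # Embed the question and pull the closest document chunks from the collection.
    query_embedding = embedder.encode([question]).tolist()
    hits = collection.query(query_embeddings=query_embedding, n_results=3)
    context = "\n\n".join(hits["documents"][0])

    # Ask the LLM to answer strictly from the retrieved context.
    llm = Groq(api_key=os.getenv("GROQ_API_KEY"))
    response = llm.chat.completions.create(
        model=os.getenv("GROQ_MODEL", "llama-3.3-70b-versatile"),
        messages=[
            {"role": "system", "content": "Answer using only the provided context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return response.choices[0].message.content
```

The same structure applies with OpenAI or Gemini; only the client and model name change.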
Features
- Retrieval-augmented question answering over your own .txt documents
- ChromaDB for persistent vector storage
- Sentence Transformers for embeddings
- Support for multiple LLM providers (OpenAI, Groq, Google Gemini)
- Configuration through a simple .env file
- Interactive command-line Q&A loop
Quick Start
Clone and Install
git clone <repository-url>
cd <repository-folder>
python -m venv .venv
.venv\Scripts\activate        # On Windows
source .venv/bin/activate     # On macOS/Linux
pip install -r requirements.txt
Configure Environment
Create a .env file in the project root:
GROQ_API_KEY=your_groq_api_key
GROQ_MODEL=llama-3.3-70b-versatile
EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
CHROMA_COLLECTION_NAME=rag_documents
You can also use OPENAI_API_KEY or GOOGLE_API_KEY if you prefer those providers.
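One common way for app.py to pick these values up is python-dotenv, sketched below; whether the project actually uses that package is an assumption.

```python
# Illustrative only: load .env into the environment and read the settings.
import os
from dotenv import load_dotenv

load_dotenv()  # reads key=value pairs from .env in the working directory

groq_api_key = os.getenv("GROQ_API_KEY")
groq_model = os.getenv("GROQ_MODEL", "llama-3.3-70b-versatile")
embedding_model = os.getenv("EMBEDDING_MODEL", "sentence-transformers/all-MiniLM-L6-v2")
collection_name = os.getenv("CHROMA_COLLECTION_NAME", "rag_documents")
```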
Add Documents
Place your .txt files in the data/ directory. Example:
data/quantum_computing.txt
Run the Assistant
cd src
python app.py
How It Works
Document Loading
Documents are loaded from the data/ directory when the assistant starts.
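A minimal sketch of that step (the load_documents name is an assumption; the real loader lives in src/):

```python
# Hypothetical loader: read every .txt file under data/ into memory.
from pathlib import Path

def load_documents(data_dir: str = "data") -> dict[str, str]:
    docs = {}
    for path in sorted(Path(data_dir).glob("*.txt")):
        docs[path.stem] = path.read_text(encoding="utf-8")
    return docs
```

The loaded documents are then chunked and embedded into the ChromaDB collection before any questions are answered.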
Customization
Change Embedding Model:
Edit EMBEDDING_MODEL in .env to use a different Sentence Transformers model.
Switch LLM Provider:
Set the appropriate API key in .env (OPENAI_API_KEY, GROQ_API_KEY, or GOOGLE_API_KEY).
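As a sketch of what that switch could look like in code (the pick_provider helper is hypothetical; the project's actual selection logic may differ):

```python
# Hypothetical helper: choose a provider based on which API key is configured.
import os

def pick_provider() -> str:
    if os.getenv("OPENAI_API_KEY"):
        return "openai"
    if os.getenv("GROQ_API_KEY"):
        return "groq"
    if os.getenv("GOOGLE_API_KEY"):
        return "gemini"
    raise RuntimeError("No LLM API key found; set one in .env")
```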
Tune Chunk Size:
Adjust the chunk_size parameter in chunk_text() for your document type.
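For reference, a minimal character-based chunk_text() with overlap looks like the sketch below; the project's actual splitter may work on sentences or tokens instead.

```python
# Minimal sketch of character-based chunking with overlap (assumed behaviour).
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    chunks = []
    start = 0
    step = chunk_size - overlap  # advance by less than chunk_size so chunks overlap
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += step
    return chunks
```

Smaller chunks tend to give more precise retrieval; larger chunks preserve more context per hit.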
Troubleshooting
Model Download Slow?
The first run downloads the embedding model from Hugging Face, which can take a few minutes; later runs reuse the local cache.
Symlink Warning on Windows?
This warning from the Hugging Face cache is safe to ignore; alternatively, enable Windows Developer Mode so the cache can use symlinks instead of duplicating files.
Import Errors?
Ensure you run python app.py from the src directory and both app.py and vectordb.py are in src/.
Example Q&A
Enter a question or 'quit' to exit: What is quantum superposition?
AI: Quantum superposition is a fundamental principle of quantum mechanics where a quantum system can exist in multiple states at once until measured...
Conclusion
This project provides a robust, extensible template for building your own RAG-powered AI assistant.
You can easily adapt it to new document types, swap LLMs, or extend the pipeline for more advanced use cases.