A Retrieval-Augmented Generation (RAG) powered AI assistant built using LangChain and Vector Databases (Chroma/FAISS) that can answer questions based on your own documents.
Developed as a project under the Agentic AI Essentials program, part of the ReadyTensor AAIDC certification.
This project demonstrates how Large Language Models (LLMs) can be enhanced through Retrieval-Augmented Generation (RAG) — a powerful technique where an LLM doesn’t rely only on its internal training data, but also retrieves relevant information from external knowledge sources (your own documents).
🔍 In simple terms:
It’s like ChatGPT, but it knows about your files — PDFs, text, notes — and gives accurate, context-specific answers.
This assistant can run entirely from your local system and interact through a Command-Line Interface (CLI) or optionally a Streamlit UI for a smoother user experience.
Traditional LLMs like GPT or Gemini have limitations — they can “hallucinate” or give outdated answers because their knowledge cutoff date is fixed.
RAG solves this by retrieving relevant passages from your own documents at query time and supplying them to the model as context.
This way, your assistant always produces accurate, up-to-date, and grounded responses based on your data.
A Retrieval-Augmented Generation (RAG) system bridges the gap between static model knowledge and dynamic, domain-specific data.
Here’s how your assistant works under the hood:
The assistant begins by reading all the files stored inside the data/ directory — these can be PDFs, text files, or markdown notes.
This step converts unstructured human-readable data into a digital format ready for processing.
Why this matters:
LLMs cannot “read” PDFs or large files directly. By preprocessing the documents, we prepare them for meaningful retrieval and embedding.
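A minimal sketch of this loading step, assuming LangChain's community document loaders (the actual `src/app.py` may use different loader classes):

```python
# Sketch of the document loading step (assumed loaders, not the exact app.py code).
from langchain_community.document_loaders import DirectoryLoader, PyPDFLoader, TextLoader

# Load every PDF and text file found under data/.
pdf_docs = DirectoryLoader("data/", glob="**/*.pdf", loader_cls=PyPDFLoader).load()
txt_docs = DirectoryLoader("data/", glob="**/*.txt", loader_cls=TextLoader).load()

documents = pdf_docs + txt_docs
print(f"Loaded {len(documents)} documents")
```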
Large documents are split into smaller, semantically meaningful pieces (called chunks).
For example, a 20-page PDF might become a few dozen short text chunks of roughly 200–500 tokens each.
Why this matters:
LLMs have context limits (e.g., 8K or 16K tokens).
Chunking allows efficient retrieval — so when a question is asked, only the most relevant parts are considered.
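Continuing the sketch above, chunking can be done with LangChain's `RecursiveCharacterTextSplitter` (the chunk size and overlap values here are illustrative assumptions):

```python
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Split each loaded document into overlapping chunks.
# chunk_size/chunk_overlap are measured in characters by default; the values are illustrative.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_documents(documents)
print(f"{len(documents)} documents -> {len(chunks)} chunks")
```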
Each text chunk is passed through a Sentence Transformer or an embedding API (like OpenAI’s text-embedding-3-small).
This converts text into a vector — a list of numbers representing semantic meaning.
Why this matters:
Vectors allow computers to “understand” the similarity between texts.
Two chunks with similar meaning will have vectors that are close together in multidimensional space.
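For illustration, a small sentence-transformers snippet showing how text becomes vectors and how similar meanings end up close together (the model name is an assumption):

```python
from sentence_transformers import SentenceTransformer, util

# Encode two semantically similar sentences into vectors and compare them.
model = SentenceTransformer("all-MiniLM-L6-v2")
vectors = model.encode(["The cat sat on the mat.", "A cat was sitting on a rug."])
print(vectors.shape)                          # (2, 384): two 384-dimensional vectors
print(util.cos_sim(vectors[0], vectors[1]))   # high cosine similarity -> close in vector space
```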
All embeddings (vectors) are stored in a vector database, such as ChromaDB or FAISS.
These databases are optimized for fast similarity search — finding which vectors (chunks) are closest to a query vector.
Why this matters:
Instead of scanning entire documents every time, the system can instantly retrieve only the most relevant text snippets.
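A sketch of building the vector store with Chroma and a HuggingFace embedding model (package and model names are assumptions; FAISS works the same way via `FAISS.from_documents`):

```python
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma

# Embed every chunk and persist the vectors locally (continuing the chunking sketch above).
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
vectorstore = Chroma.from_documents(chunks, embeddings, persist_directory="chroma_db")
```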
When a user asks a question, the system embeds the query with the same embedding model and runs a similarity search against the vector store.
The top-matched chunks are then retrieved as the context for the model.
Why this matters:
This allows the assistant to answer questions using your documents instead of relying only on the model’s pretraining.
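Continuing the sketch, retrieval is a similarity search against the vector store (the question shown is a hypothetical example):

```python
# Embed the question and pull back the k most similar chunks as context.
question = "What topics are covered in the project documentation?"  # hypothetical query
top_chunks = vectorstore.similarity_search(question, k=4)
context = "\n\n".join(doc.page_content for doc in top_chunks)
```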
The retrieved context and the user’s question are combined into a structured prompt template, something like:
“You are an AI assistant. Use the following context to answer the question accurately.
Context: [retrieved chunks]
Question: [user query]”
Why this matters:
Good prompt engineering ensures the LLM produces concise, grounded, and hallucination-free responses.
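The same template expressed with LangChain's prompt classes (a sketch; the real template in `app.py` may differ):

```python
from langchain_core.prompts import ChatPromptTemplate

# Fill the template with the retrieved context and the user's question.
prompt = ChatPromptTemplate.from_template(
    "You are an AI assistant. Use the following context to answer the question accurately.\n"
    "Context: {context}\n"
    "Question: {question}"
)
messages = prompt.format_messages(context=context, question=question)
```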
Finally, the prompt is sent to an LLM backend — such as OpenAI GPT, Groq, or Google Gemini.
The model processes both the question and retrieved context to produce an accurate, human-like answer.
Why this matters:
This is where generation happens — the assistant doesn’t “memorize” answers but reasons over context, creating a dynamic and trustworthy Q&A system.
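A final generation sketch using an OpenAI chat model (the model name is an assumption; Groq or Gemini chat models plug in the same way through their LangChain integrations):

```python
from langchain_openai import ChatOpenAI

# Send the grounded prompt to the LLM and print its answer.
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
answer = llm.invoke(messages)
print(answer.content)
```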
You can iteratively improve the system by adding more documents to the data/ directory. This makes the assistant smarter and more aligned with your specific knowledge base over time.
| Component | Technology | Purpose |
|---|---|---|
| Framework | LangChain | Handles chaining of retrieval & generation logic |
| Vector Store | Chroma / FAISS | Stores embeddings for semantic search |
| Embeddings | Sentence Transformers or LLM APIs | Converts text chunks into numerical vectors |
| LLM Backend | OpenAI, Groq, or Google Gemini | Generates final context-grounded answers |
| Interface | CLI / Streamlit | For user interaction |
| Language | Python 3.10+ | Core programming language |
rt-aaidc-module1/
├── src/
│ ├── app.py # Main RAG application
│ └── vectordb.py # Vector database wrapper
├── data/                # Contains documents
│ ├── *.txt # Contains text files
├── requirements.txt # All dependencies included
└── README.md # This guide
This project was created as part of the ReadyTensor AI Applied Intelligence Developer Certification (AAIDC) program —
a structured and mentor-guided course designed to transform learners from LLM users into AI system builders.
The AAIDC program by ReadyTensor focuses on helping developers build real-world, knowledge-grounded AI systems rather than just prompting them.
This module introduces the core architecture behind Retrieval-Augmented Generation (RAG).
Students learn to load and chunk documents, generate embeddings, store them in a vector database, and connect retrieval to an LLM for grounded answers.
By completing this module, I gained a deep, practical understanding of how a RAG pipeline works end to end, from document ingestion and embedding to retrieval and grounded generation.
🧩 In essence, this project marks my first milestone in building real-world AI systems — not just prompting them.
By completing this RAG-based AI Assistant project as part of the ReadyTensor AAIDC Program, I achieved both technical mastery and conceptual understanding of modern AI assistant design.
The major outcomes include hands-on experience with document ingestion, chunking, embeddings, vector search, prompt engineering, and context-grounded generation.
Follow the steps below to set up and run the project locally:
git clone https://github.com/krishpansara/rt-aaidc-module1/
cd rt-aaidc-module1
It’s recommended to create a virtual environment to isolate project dependencies.
On Windows:
python -m venv venv
venv\Scripts\activate
On macOS/Linux:
python3 -m venv venv
source venv/bin/activate
Once the virtual environment is activated, install all dependencies using:
pip install -r requirements.txt
This project supports multiple LLM providers (OpenAI, Groq, Google).
You need to set your API keys before running the app.
After creating the virtual environment, open the .env file in the project root directory and add your API keys as shown below:
# .env file
# Example: use one or more providers depending on your setup
OPENAI_API_KEY=your_openai_api_key_here
GROQ_API_KEY=your_groq_api_key_here
GOOGLE_API_KEY=your_google_api_key_here
python app.py
Special thanks to ReadyTensor.ai for providing structured learning and the RAG project template.
Inspired by the open-source AI community and the LangChain ecosystem.
🧭 In summary:
This project strengthened my ability to build end-to-end AI systems, combining data engineering, machine learning, and software development.
It represents a foundational step toward becoming a practical AI/ML engineer capable of developing custom knowledge-grounded assistants.