AAIDC 2025
Eshaal Zehra
This project implements a lightweight Retrieval-Augmented Generation (RAG) assistant designed for the AAIDC Project 1: Foundations of Agentic AI. Unlike typical RAG pipelines that rely on large language models, dense embeddings, and heavy frameworks such as LangChain or Hugging Face, this system demonstrates the core principles of RAG using only Python, scikit-learn TF-IDF, and optional FAISS for similarity search.
The assistant ingests plain-text documents, splits them into chunks, indexes them using vector representations, and answers natural language queries by retrieving relevant context and generating rule-based responses. This minimal implementation provides an educational baseline for understanding the mechanics of RAG without requiring GPUs or proprietary APIs.
RAG systems combine retrieval (finding relevant knowledge) with generation (producing an answer). Most modern implementations use large language models (LLMs) such as GPT, combined with dense embeddings and vector databases. However, these approaches can be too heavy for learners or for environments with limited compute.
This project addresses the need for a lightweight, dependency-minimal RAG assistant that demonstrates the concept clearly. By using TF-IDF vectors, cosine similarity, and a small rule-based knowledge base, it enables:
Fast prototyping without GPUs or external APIs.
Transparency in how documents are split, indexed, and retrieved.
Educational insight into RAG mechanics for beginners.
The assistant allows users to query a small document collection (sample files on LangChain, vector databases, and RAG itself) and receive grounded responses.
The assistant follows a four-step pipeline.

Step 1: Document Loading
Loads .txt files from a given directory.
Wraps each file in a Document object with metadata (the source filename).

Step 2: Chunking
A custom SimpleTextSplitter chunks text into manageable segments (~1,000 characters).
Overlap is preserved between chunks for better context, as sketched below.

Flow: Text → Sentences → Chunks → Indexed Documents
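A minimal sketch of the loading and chunking steps is shown below. The Document and SimpleTextSplitter names follow the description above, but the constructor signatures and the 200-character overlap are illustrative assumptions, not the project's exact implementation.

```python
from dataclasses import dataclass
from pathlib import Path

@dataclass
class Document:
    text: str
    metadata: dict  # e.g. {"source": "langchain_overview.txt"}

def load_documents(directory: str) -> list[Document]:
    """Load every .txt file in the directory as a Document."""
    return [Document(text=p.read_text(encoding="utf-8"),
                     metadata={"source": p.name})
            for p in sorted(Path(directory).glob("*.txt"))]

class SimpleTextSplitter:
    """Split text into ~chunk_size-character segments with overlap.
    The 200-character overlap is an illustrative default."""
    def __init__(self, chunk_size: int = 1000, overlap: int = 200):
        self.chunk_size = chunk_size
        self.overlap = overlap

    def split(self, doc: Document) -> list[Document]:
        chunks, start = [], 0
        while start < len(doc.text):
            end = start + self.chunk_size
            chunks.append(Document(doc.text[start:end], dict(doc.metadata)))
            start = end - self.overlap  # step back to preserve overlap
        return chunks
```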
Step 3: Indexing
Default mode: scikit-learn TfidfVectorizer creates sparse vectors.
Optional mode: FAISS (if available) provides efficient similarity search.
Both the chunked documents and their vector representations are stored.
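The default indexing mode, using the TF-IDF settings listed under Setup (5,000 max features, unigrams and bigrams), might look like the sketch below. The "data" directory name and the FAISS wiring are assumptions, shown with standard faiss calls.

```python
# Build the index over all chunks produced by the splitter above.
from sklearn.feature_extraction.text import TfidfVectorizer

docs = load_documents("data")  # directory name is an assumption
splitter = SimpleTextSplitter()
chunks = [chunk for doc in docs for chunk in splitter.split(doc)]

vectorizer = TfidfVectorizer(max_features=5000, ngram_range=(1, 2))
doc_vectors = vectorizer.fit_transform([c.text for c in chunks])  # one sparse row per chunk

# Optional mode: if FAISS is installed, densify and build an inner-product
# index; L2-normalized vectors make inner product equal cosine similarity.
try:
    import faiss
    dense = doc_vectors.toarray().astype("float32")
    faiss.normalize_L2(dense)
    index = faiss.IndexFlatIP(dense.shape[1])
    index.add(dense)
except ImportError:
    index = None  # fall back to plain scikit-learn retrieval
```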
Step 4: Retrieval & Response Generation
Retrieval: given a query, the system computes cosine similarity between the query vector and all document vectors, then selects the top-k chunks.
Response generation: rule-based answers are returned for known topics (LangChain, FAISS, RAG, vector databases); otherwise, relevant sentences are extracted from the retrieved context.
Sources, similarity scores, and timestamps accompany every answer for transparency.
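A condensed sketch of retrieval plus extractive answering follows, reusing the vectorizer, doc_vectors, and chunks from the snippets above; the rule table is abbreviated to a single illustrative entry.

```python
from datetime import datetime
from sklearn.metrics.pairwise import cosine_similarity

# Illustrative rule table; the real assistant also covers LangChain,
# FAISS, and vector databases.
RULES = {
    "rag": "Retrieval-Augmented Generation combines LLMs with external knowledge retrieval.",
}

def answer(query: str, top_k: int = 3) -> dict:
    query_vec = vectorizer.transform([query])                  # reuse fitted vectorizer
    scores = cosine_similarity(query_vec, doc_vectors).ravel()
    top_idx = scores.argsort()[::-1][:top_k]                   # top-k most similar chunks
    context = [chunks[i] for i in top_idx]

    # Rule-based answer for a known topic; otherwise extract the leading
    # sentences of the best-matching chunk.
    key = next((k for k in RULES if k in query.lower()), None)
    text = RULES[key] if key else ". ".join(context[0].text.split(". ")[:2])

    return {
        "answer": text,
        "sources": [c.metadata["source"] for c in context],
        "scores": [round(float(scores[i]), 3) for i in top_idx],
        "timestamp": datetime.now().isoformat(),
    }
```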
The system runs entirely from the command line (CLI):
Start the assistant via python main.py.
Users can ask natural language questions.
Commands include:
quit – exit the assistant
history – show past Q&A
examples – display suggested queries
Example questions include:
"What is LangChain?"
"Which vector databases are mentioned?"
"How does RAG work?"
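Putting the commands and examples together, a minimal REPL loop might look like the sketch below; the answer function is the one sketched earlier, and the actual main.py may differ.

```python
# Hypothetical REPL loop for the CLI described above.
def main():
    history = []
    print("Ask a question, or type 'quit', 'history', or 'examples'.")
    while True:
        query = input("> ").strip()
        if query == "quit":
            break
        if query == "history":
            for q, a in history:               # show past Q&A
                print(f"Q: {q}\nA: {a}")
            continue
        if query == "examples":
            print("What is LangChain?")
            print("Which vector databases are mentioned?")
            print("How does RAG work?")
            continue
        result = answer(query)
        print(result["answer"])
        print("Sources:", ", ".join(result["sources"]))
        history.append((query, result["answer"]))

if __name__ == "__main__":
    main()
```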
Setup
Dataset: 3 sample .txt files (LangChain overview, Vector databases guide, RAG systems explained).
Indexing: TF-IDF vectors (5000 max features, unigrams & bigrams).
Retrieval: cosine similarity, top-3 chunks.
Example Queries & Responses
Q: What is RAG?
A: Retrieval-Augmented Generation combines LLMs with external knowledge retrieval. Based on the documents: "RAG reduces hallucinations by grounding responses in real documents..."
Q: Which vector databases are mentioned?
A: Popular ones include FAISS, Pinecone, Weaviate, Chroma, and Qdrant.
Q: Tell me about LangChain components.
A: LangChain includes Prompts, Models, Chains, Agents, and Memory. From the documents: "LangChain supports integration with OpenAI GPT models and Hugging Face transformers..."
Observations:
Retrieval Quality: TF-IDF was sufficient to return semantically related chunks for technical queries.
Response Quality: Rule-based augmentation ensured concise and accurate answers for core topics.
Usability: CLI worked smoothly; history and example features helped with exploration.
Limitations:
No generative LLM; answers are extractive and rule-based.
Answer quality degrades for queries outside the dataset's scope.
Does not yet support advanced reasoning (ReAct, chain-of-thought).
This project demonstrates the foundations of RAG without heavy dependencies. By combining TF-IDF retrieval with lightweight response rules, it provides a clear, educational showcase of:
Document ingestion, chunking, and vector indexing.
Retrieval-based question answering.
Extensible architecture for future improvements.
Future improvements:
Integrate modern embeddings (sentence-transformers), as sketched after this list.
Support FAISS as the default retriever.
Add session memory for conversational context.
Extend to a web-based UI (Gradio/Streamlit).
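As a sketch of the first item, TF-IDF retrieval could be swapped for dense embeddings with only a few lines; the model name here is an assumption, chosen because it runs comfortably on CPU.

```python
# Hypothetical swap of TF-IDF for dense sentence-transformers embeddings.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # model choice is an assumption
chunk_embeddings = model.encode([c.text for c in chunks],
                                normalize_embeddings=True)

def dense_retrieve(query: str, top_k: int = 3):
    """Return indices of the top-k chunks by cosine similarity."""
    q = model.encode([query], normalize_embeddings=True)
    scores = (chunk_embeddings @ q.T).ravel()    # cosine similarity (vectors normalized)
    return scores.argsort()[::-1][:top_k]
```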