
This publication presents a Retrieval Augmented Generation (RAG) based conversational assistant designed to answer questions grounded in Ready Tensor publications. The assistant integrates document ingestion, semantic embedding, and retrieval mechanisms with large language models (LLMs) to enable contextually grounded dialogue.
The system demonstrates an end to end Agentic AI pipeline that retrieves relevant publication excerpts and generates coherent, factually consistent responses, showcasing the synergy between retrieval based grounding and generative reasoning.
Agentic AI systems combine autonomous reasoning with data-grounded generation. As part of the AAIDC Module 1 – Foundations of Agentic AI, this project explores how retrieval mechanisms enhance factual accuracy in conversational models.
The RAG architecture leverages both vector-based semantic search and generative LLM reasoning, making it a cornerstone for domain specific conversational systems such as academic or research assistants. This implementation focuses specifically on Ready Tensor’s 35 AI/ML publications, enabling a question and answering interface that retrieves contextually relevant information and provides concise, reliable responses.
The goal was to design and implement a RAG based AI assistant that can answer questions over a curated corpus of Ready Tensor publications.
The assistant combines retrieval based grounding with generative reasoning to produce factual, contextually rich responses.
Specific Objectives:
Ingest 35 Ready Tensor publications (JSON format)
Generate semantic embeddings using OpenAI’s text-embedding-3-small
Implement a FAISS based retriever for efficient similarity search.
Integrate a ChatOpenAI-powered LLM to generate responses grounded in retrieved context.
Build an interactive chat interface using both Jupyter Widgets and Streamlit.
4.1 Document Ingestion
Publications were loaded from project_1_publications.json, containing titles and abstracts. Each record was merged into retrievable text blocks.
4.2 Text Chunking
The RecursiveCharacterTextSplitter divided content into 1000-character segments with 150-character overlap, resulting in 1182 optimized chunks.
4.3 Embeddings & Vector Store
Each chunk was converted into vector embeddings using OpenAI’s text-embedding-3-small, and stored in a FAISS vector database, allowing efficient semantic similarity retrieval.
4.4 Retrieval and Generation
Upon receiving a query, the retriever fetches the top 3 semantically related chunks. These are passed through a custom LangChain PromptTemplate, which provides structured contextual grounding before querying GPT-3.5-turbo.
4.5 Conversational Memory
To enable multi-turn interactions, a ConversationBufferMemory component was used to maintain conversational continuity across queries.
4.6 User Interface
Two user-facing interfaces were implemented:
Notebook Chat Widget Interface: Built using ipywidgets, providing a live chat-style notebook interface.
Streamlit Web App: Offers a modern UI for end-users to interact with the model in real time.
.png?Expires=1761611416&Key-Pair-Id=K2V2TN6YBJQHTG&Signature=RTpIka6p802-rGkSmsOu9DbKmkcJokaanRKsIm7s5HlVSfJJ0jMCpfTOnjtYaYhJtThBboFZVyI8kbXygBGcCOnAmarsgM899kPNlk3-DOLJw2bdaOUuw~4ksK3uymKApFZDOHk2vCn3QP0Ihdl4NOmrrMKggY8fT4xg3EO55FTjQs6z3ju9Tdw1zXHviWiZdH8M2y1fI~AvZbXSeRgjLXUT9t-hhwQ5Kw288UXevbZFHMoM2G8NNTL7tbb0sv5qYqo3XycnuZuYQC17ieeMKeZhr2T~TbAUvU8np25IfeBbNck-kmvq1lvb3GpWf8GCZVde94aFyi18uRqyIYnksQ__)

| Component | Description |
|---|---|
| Frameworks | LangChain, FAISS, OpenAI API, ipywidgets, Streamlit |
| Dataset | 35 Ready Tensor Publications |
| Chunk Size / Overlap | 1000 / 150 |
| Embedding Model | text-embedding-3-small |
| Vector Store | FAISS |
| LLM | GPT-3.5-turbo |
| Retriever | Top-3 semantic matches |
| Prompt Template | Context-aware publication retrieval |
| Memory | In-session persistence |
| UI | Chat widget + Streamlit |
Example Queries
“Which Ready Tensor article discusses RAG evaluation methods?”
“Who authored the Agentic AI course modules?”
“What are the goals of the AAIDC certification?”
Each query was successfully grounded in retrieved publication excerpts, ensuring that generated responses remained factual, concise, and domain-relevant
The project aligns with Ready Tensor’s Technical Evaluation Rubric (Tool / App / Software Development category)
| Evaluation Criteria | Outcome |
|---|---|
| Documentation & Code Clarity | ✅ Detailed and modular |
| Reproducibility | ✅ Dataset and parameters provided |
| End-to-End Functionality | ✅ Working RAG pipeline |
| Relevance & Utility | ✅ Domain-specific assistant |
| Compliance | ✅ CC BY–NC License and dataset alignment |
| Performance | ✅ Meets >80% of essential criteria |
Current Limitations
Limited to static corpus (no live API retrieval).
Responses are descriptive, not analytical.
Session memory resets on restart.
Future Enhancements
Incorporate ReAct or Chain-of-Thought (CoT) reasoning for deeper analysis.
Introduce persistent long-term memory.
Extend to multi-domain RAG pipelines with cross-document reasoning.
Add evaluation metrics such as retrieval precision and grounding faithfulness.
This project demonstrates the practical implementation of a Retrieval-Augmented Generation system tailored for Ready Tensor’s publication database. It successfully integrates retrieval, generation, and conversational continuity into a coherent workflow, forming a foundational prototype for future Agentic AI assistants.
The system exemplifies how LLMs, when coupled with retrieval mechanisms, can be made factually grounded, reproducible, and domain-aware.
▶️ How to Run
Upload project_1_publications.json dataset.
Set your OpenAI API
openai.api_key = userdata.get("OPENAI_API_KEY").
Run the notebook to load the FAISS vector store.
Launch the chat UI in both Chat Widget and Streamlit.
Ask questions and view real-time, citation-based responses.
Ready Tensor — Technical Evaluation Rubric for AI/ML Publications (2025)
AAIDC Module 1 — Foundations of Agentic AI, Project Guidelines
LangChain Documentation — Retrieval-Augmented Generation and Vector Stores
Ready Tensor Dataset — Publication JSON Release (2025)
OpenAI API Reference — text-embedding-3-small & gpt-3.5-turbo