AAIDC Project 1- RAG-Based Conversational Assistant for Ready Tensor Publications

ChatGPT Image Oct 27, 2025, 09_20_11 PM.png

1. Abstract

This publication presents a Retrieval Augmented Generation (RAG) based conversational assistant designed to answer questions grounded in Ready Tensor publications. The assistant integrates document ingestion, semantic embedding, and retrieval mechanisms with large language models (LLMs) to enable contextually grounded dialogue.
The system demonstrates an end to end Agentic AI pipeline that retrieves relevant publication excerpts and generates coherent, factually consistent responses, showcasing the synergy between retrieval based grounding and generative reasoning.

2. Introduction

Agentic AI systems combine autonomous reasoning with data-grounded generation. As part of the AAIDC Module 1 – Foundations of Agentic AI, this project explores how retrieval mechanisms enhance factual accuracy in conversational models.

The RAG architecture leverages both vector-based semantic search and generative LLM reasoning, making it a cornerstone for domain specific conversational systems such as academic or research assistants. This implementation focuses specifically on Ready Tensor’s 35 AI/ML publications, enabling a question and answering interface that retrieves contextually relevant information and provides concise, reliable responses.

3. Objectives

The goal was to design and implement a RAG based AI assistant that can answer questions over a curated corpus of Ready Tensor publications.
The assistant combines retrieval based grounding with generative reasoning to produce factual, contextually rich responses.

Specific Objectives:

Ingest 35 Ready Tensor publications (JSON format)
Generate semantic embeddings using OpenAI’s text-embedding-3-small
Implement a FAISS based retriever for efficient similarity search.
Integrate a ChatOpenAI-powered LLM to generate responses grounded in retrieved context.
Build an interactive chat interface using both Jupyter Widgets and Streamlit.

4. System Architecture

4.1 Document Ingestion
Publications were loaded from project_1_publications.json, containing titles and abstracts. Each record was merged into retrievable text blocks.

4.2 Text Chunking
The RecursiveCharacterTextSplitter divided content into 1000-character segments with 150-character overlap, resulting in 1182 optimized chunks.

4.3 Embeddings & Vector Store
Each chunk was converted into vector embeddings using OpenAI’s text-embedding-3-small, and stored in a FAISS vector database, allowing efficient semantic similarity retrieval.

4.4 Retrieval and Generation
Upon receiving a query, the retriever fetches the top 3 semantically related chunks. These are passed through a custom LangChain PromptTemplate, which provides structured contextual grounding before querying GPT-3.5-turbo.

4.5 Conversational Memory
To enable multi-turn interactions, a ConversationBufferMemory component was used to maintain conversational continuity across queries.

4.6 User Interface
Two user-facing interfaces were implemented:

Notebook Chat Widget Interface: Built using ipywidgets, providing a live chat-style notebook interface.
Streamlit Web App: Offers a modern UI for end-users to interact with the model in real time.

Notebook Chat Widget Interface

Screenshot (89).png

Streamlit Web App

5. Implementation Details

Component	Description
Frameworks	LangChain, FAISS, OpenAI API, ipywidgets, Streamlit
Dataset	35 Ready Tensor Publications
Chunk Size / Overlap	1000 / 150
Embedding Model	`text-embedding-3-small`
Vector Store	FAISS
LLM	GPT-3.5-turbo
Retriever	Top-3 semantic matches
Prompt Template	Context-aware publication retrieval
Memory	In-session persistence
UI	Chat widget + Streamlit

6. Experiments and Examples

Example Queries

“Which Ready Tensor article discusses RAG evaluation methods?”

“Who authored the Agentic AI course modules?”

“What are the goals of the AAIDC certification?”

Each query was successfully grounded in retrieved publication excerpts, ensuring that generated responses remained factual, concise, and domain-relevant

7. Evaluation

The project aligns with Ready Tensor’s Technical Evaluation Rubric (Tool / App / Software Development category)

Evaluation Criteria	Outcome
Documentation & Code Clarity	✅ Detailed and modular
Reproducibility	✅ Dataset and parameters provided
End-to-End Functionality	✅ Working RAG pipeline
Relevance & Utility	✅ Domain-specific assistant
Compliance	✅ CC BY–NC License and dataset alignment
Performance	✅ Meets >80% of essential criteria

8. Limitations and Future Enhancements

Current Limitations

Limited to static corpus (no live API retrieval).
Responses are descriptive, not analytical.
Session memory resets on restart.

Future Enhancements

Incorporate ReAct or Chain-of-Thought (CoT) reasoning for deeper analysis.
Introduce persistent long-term memory.
Extend to multi-domain RAG pipelines with cross-document reasoning.
Add evaluation metrics such as retrieval precision and grounding faithfulness.

9. Conclusion

This project demonstrates the practical implementation of a Retrieval-Augmented Generation system tailored for Ready Tensor’s publication database. It successfully integrates retrieval, generation, and conversational continuity into a coherent workflow, forming a foundational prototype for future Agentic AI assistants.

The system exemplifies how LLMs, when coupled with retrieval mechanisms, can be made factually grounded, reproducible, and domain-aware.

▶️ How to Run

Upload project_1_publications.json dataset.
Set your OpenAI API

openai.api_key = userdata.get("OPENAI_API_KEY").

Run the notebook to load the FAISS vector store.
Launch the chat UI in both Chat Widget and Streamlit.
Ask questions and view real-time, citation-based responses.

10. References

Ready Tensor — Technical Evaluation Rubric for AI/ML Publications (2025)
AAIDC Module 1 — Foundations of Agentic AI, Project Guidelines
LangChain Documentation — Retrieval-Augmented Generation and Vector Stores
Ready Tensor Dataset — Publication JSON Release (2025)
OpenAI API Reference — text-embedding-3-small & gpt-3.5-turbo

ChatGPT Image Oct 27, 2025, 09_20_11 PM.png

1. Abstract

2. Introduction

3. Objectives

Specific Objectives:

Ingest 35 Ready Tensor publications (JSON format)
Generate semantic embeddings using OpenAI’s text-embedding-3-small
Implement a FAISS based retriever for efficient similarity search.
Integrate a ChatOpenAI-powered LLM to generate responses grounded in retrieved context.
Build an interactive chat interface using both Jupyter Widgets and Streamlit.

4. System Architecture

4.1 Document Ingestion
Publications were loaded from project_1_publications.json, containing titles and abstracts. Each record was merged into retrievable text blocks.

4.2 Text Chunking
The RecursiveCharacterTextSplitter divided content into 1000-character segments with 150-character overlap, resulting in 1182 optimized chunks.

4.5 Conversational Memory
To enable multi-turn interactions, a ConversationBufferMemory component was used to maintain conversational continuity across queries.

4.6 User Interface
Two user-facing interfaces were implemented:

Notebook Chat Widget Interface: Built using ipywidgets, providing a live chat-style notebook interface.
Streamlit Web App: Offers a modern UI for end-users to interact with the model in real time.

Notebook Chat Widget Interface

Screenshot (89).png

Streamlit Web App

5. Implementation Details

Component	Description
Frameworks	LangChain, FAISS, OpenAI API, ipywidgets, Streamlit
Dataset	35 Ready Tensor Publications
Chunk Size / Overlap	1000 / 150
Embedding Model	`text-embedding-3-small`
Vector Store	FAISS
LLM	GPT-3.5-turbo
Retriever	Top-3 semantic matches
Prompt Template	Context-aware publication retrieval
Memory	In-session persistence
UI	Chat widget + Streamlit

6. Experiments and Examples

Example Queries

“Which Ready Tensor article discusses RAG evaluation methods?”

“Who authored the Agentic AI course modules?”

“What are the goals of the AAIDC certification?”

Each query was successfully grounded in retrieved publication excerpts, ensuring that generated responses remained factual, concise, and domain-relevant

7. Evaluation

The project aligns with Ready Tensor’s Technical Evaluation Rubric (Tool / App / Software Development category)

Evaluation Criteria	Outcome
Documentation & Code Clarity	✅ Detailed and modular
Reproducibility	✅ Dataset and parameters provided
End-to-End Functionality	✅ Working RAG pipeline
Relevance & Utility	✅ Domain-specific assistant
Compliance	✅ CC BY–NC License and dataset alignment
Performance	✅ Meets >80% of essential criteria

8. Limitations and Future Enhancements

Current Limitations

Limited to static corpus (no live API retrieval).
Responses are descriptive, not analytical.
Session memory resets on restart.

Future Enhancements

Incorporate ReAct or Chain-of-Thought (CoT) reasoning for deeper analysis.
Introduce persistent long-term memory.
Extend to multi-domain RAG pipelines with cross-document reasoning.
Add evaluation metrics such as retrieval precision and grounding faithfulness.

9. Conclusion

The system exemplifies how LLMs, when coupled with retrieval mechanisms, can be made factually grounded, reproducible, and domain-aware.

▶️ How to Run

Upload project_1_publications.json dataset.
Set your OpenAI API

openai.api_key = userdata.get("OPENAI_API_KEY").

Run the notebook to load the FAISS vector store.
Launch the chat UI in both Chat Widget and Streamlit.
Ask questions and view real-time, citation-based responses.

10. References

Ready Tensor — Technical Evaluation Rubric for AI/ML Publications (2025)
AAIDC Module 1 — Foundations of Agentic AI, Project Guidelines
LangChain Documentation — Retrieval-Augmented Generation and Vector Stores
Ready Tensor Dataset — Publication JSON Release (2025)
OpenAI API Reference — text-embedding-3-small & gpt-3.5-turbo

AAIDC Project 1- RAG-Based Conversational Assistant for Ready Tensor Publications

Table of contents

1. Abstract

2. Introduction

3. Objectives

4. System Architecture

Notebook Chat Widget Interface

Streamlit Web App

5. Implementation Details

6. Experiments and Examples

7. Evaluation

8. Limitations and Future Enhancements

9. Conclusion

10. References

Table of contents

1. Abstract

2. Introduction

3. Objectives

4. System Architecture

Notebook Chat Widget Interface

Streamlit Web App

5. Implementation Details

6. Experiments and Examples

7. Evaluation

8. Limitations and Future Enhancements

9. Conclusion

10. References

Code

Code