By Nwabuisi Chizaram
*Agentic AI Developer Certification Program | Module 1 Project*
Every day, professionals, students, and researchers drown in PDFs: academic papers, manuals, policy documents, reports. Buried within these documents are the answers we seek, yet we often lack the time or patience to dig through 50+ pages of dense information. I found myself asking:
*“What if we could just talk to a document as easily as we talk to ChatGPT and get accurate, contextual answers sourced directly from the text?”*
That question led me to build the AskYourPDF RAG Assistant — an intelligent document Q&A system that allows users to upload any PDF and instantly ask questions in natural language. Powered by LangChain, FAISS, and OpenAI’s GPT-4, and wrapped in a clean Streamlit interface, the assistant was my first submission for the Agentic AI Developer Certification Program by Ready Tensor.
What follows is the story of how I built it from idea to deployment.
Large Language Models (LLMs) like GPT-4 are undeniably powerful. They can reason, summarize, and answer questions with remarkable fluency. But they suffer from one major drawback: they don’t “know” your documents unless those documents are explicitly included in the prompt or training data. That’s a problem if you're trying to query a niche report, a company manual, or a recent academic article.
This is where Retrieval-Augmented Generation (RAG) comes in. Instead of relying solely on a pre-trained model, RAG enhances LLMs by retrieving relevant text snippets from external sources — like your uploaded PDF — and appending them as context before the LLM generates a response. It’s like giving GPT-4 glasses to actually read your document before answering.
The goal of my project was to make this power accessible to everyone — no coding skills required.
I started by mapping the user journey: someone uploads a PDF, types a question, and gets an answer with citations. Simple in theory, but behind the scenes, there were several moving parts to orchestrate. I chose tools that would let me build fast, iterate faster, and scale gracefully: Streamlit for the interface, PyMuPDF for text extraction, LangChain for orchestration, FAISS for vector search, and OpenAI’s GPT-4 for generation.
Once a user uploads a PDF, the assistant begins by extracting the raw text using PyMuPDF. That text is then broken into coherent chunks using LangChain’s `RecursiveCharacterTextSplitter`, which ensures semantic meaning is preserved across each segment.
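The chunking step can be sketched in plain Python. This is a simplified stand-in for `RecursiveCharacterTextSplitter` (the real splitter additionally recurses through separators like paragraphs and sentences before falling back to character counts); the sizes here are illustrative, not the app’s exact configuration:

```python
# Simplified illustration of overlapping chunking. Assumes overlap < chunk_size.
def split_text(text: str, chunk_size: int = 1000, overlap: int = 150) -> list[str]:
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        chunks.append(text[start:end])
        if end == len(text):
            break
        # Step back by `overlap` so context is preserved across boundaries.
        start = end - overlap
    return chunks
```

The overlap is what keeps a sentence that straddles a boundary fully visible in at least one chunk.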
Each chunk is passed through OpenAI’s Embedding API and transformed into a vector — a numerical representation of its semantic meaning. These vectors are stored in a FAISS index, enabling lightning-fast similarity search.
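Conceptually, the FAISS step reduces to nearest-neighbour search over those vectors. Here is a toy sketch using hand-made 2-D vectors in place of real OpenAI embeddings; FAISS does the same ranking at scale with optimized index structures:

```python
import math

# Toy stand-in for the vector index: rank stored chunk vectors by
# cosine similarity to a query vector. In the real app, vectors come
# from OpenAI's Embedding API and FAISS performs the search.
def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec: list[float], chunk_vecs: list[list[float]], k: int = 2) -> list[int]:
    ranked = sorted(range(len(chunk_vecs)),
                    key=lambda i: cosine(query_vec, chunk_vecs[i]),
                    reverse=True)
    return ranked[:k]
```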
When a user submits a question, the system performs a vector similarity search to identify the most relevant chunks from the document. These retrieved segments, along with the question, are then passed into a LangChain QA chain, which wraps GPT-4 with a prompt that encourages grounded, referenceable answers.
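Under the hood, the QA chain’s main job is to stitch the retrieved chunks and the question into a single grounded prompt. A hedged sketch of that assembly (the exact prompt template LangChain uses differs, and `build_prompt` is an illustrative name):

```python
# Assemble retrieved chunks plus the user's question into one grounded
# prompt, with numbered snippets so the answer can cite its sources.
def build_prompt(question: str, chunks: list[str]) -> str:
    context = "\n\n".join(f"[{i + 1}] {chunk}" for i, chunk in enumerate(chunks))
    return (
        "Answer the question using ONLY the context below, and cite the "
        "numbered snippets you used.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```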
Finally, the result is displayed in the Streamlit interface — complete with citations pointing to the exact document snippets used for the answer.
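The citation display itself is a plain formatting step before Streamlit renders it (via widgets such as `st.write` and `st.expander`); `render_answer` is an illustrative name, not the app’s actual function:

```python
# Format the model's answer followed by the numbered source snippets
# that were retrieved for it. Streamlit then renders the result.
def render_answer(answer: str, sources: list[str]) -> str:
    lines = [answer, "", "Sources:"]
    lines += [f"[{i + 1}] {snippet}" for i, snippet in enumerate(sources)]
    return "\n".join(lines)
```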
To test the assistant, I uploaded the foundational NLP paper “Attention Is All You Need”. I asked it questions like:
- “What is the purpose of positional encoding in transformers?”
- “How does the model handle long-range dependencies?”
The assistant answered clearly, referencing the appropriate section of the paper. That was the moment I realized the power of this tool.
Beyond academic papers, I envision the same workflow serving other real-world scenarios, such as querying company manuals, policy documents, and business reports.
One of the early challenges I faced was tuning the chunk size and overlap. Small chunks lacked context; large chunks diluted relevance. After experimentation, I found a balanced configuration that preserved semantic integrity while optimizing retrieval quality.
Another key lesson was handling out-of-scope questions. Initially, GPT-4 would hallucinate confidently if no relevant content was found. I addressed this by refining the system prompt to instruct the model to admit uncertainty if no answer could be found — drastically improving trust.
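The fix was a prompt-level change. Here is an illustrative version of that kind of system instruction (not the exact wording used in the app):

```python
# Illustrative system prompt that curbs hallucination by giving the
# model an explicit "don't know" escape hatch.
SYSTEM_PROMPT = (
    "You are a document Q&A assistant. Answer ONLY from the provided "
    "context. If the context does not contain the answer, reply: "
    "\"I couldn't find that in the document.\" Never guess or invent details."
)
```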
I also had to ensure API security. During development, `.env` files managed local secrets, while in production I used Streamlit’s secrets manager, keeping credentials hidden and the app secure.
Unlike generic AI chat tools, this assistant is context-aware and document-grounded. It doesn’t just generate an answer — it shows exactly where the answer came from. This is critical for knowledge workers, researchers, and students who demand transparency and trust.
It’s also accessible. There’s no setup or configuration required — just upload your PDF and start asking questions.
The assistant performed exceptionally well on technical, academic, and business PDFs. It returned coherent, relevant answers to in-scope questions and gracefully handled queries that weren’t covered in the text.
This project served as my capstone for Module 1 of the Agentic AI Developer Certification Program, and it reinforced my belief that LLMs become exponentially more useful when they are grounded in user-specific content.
You can explore the full source code and architecture on GitHub:
👉 https://github.com/Chizzy0428/RAG-Assistant-Project/tree/main
I’m currently exploring further enhancements to the assistant. This is just the beginning.
Nwabuisi Chizaram is a Data Scientist, Educator, and AI Developer passionate about building tools that make intelligence accessible. His work focuses on NLP, agentic systems, and real-world AI-powered applications for education, productivity, and research.
API keys are stored securely using `.env` in development and Streamlit Secrets in production. No secrets are ever exposed in the codebase.
“We’re not just building apps that answer questions — we’re building assistants that understand your world.”