By Nwabuisi Chizaram
*Agentic AI Developer Certification Program | Module 1 Project*
Every day, professionals, students, and researchers drown in PDFs: academic papers, manuals, policy documents, reports. Buried within these documents are the answers we seek, yet we often lack the time or patience to dig through 50+ pages of dense information. I found myself asking:
*“What if we could just talk to a document as easily as we talk to ChatGPT and get accurate, contextual answers sourced directly from the text?”*
That question led me to build the AskYourPDF RAG Assistant — an intelligent document Q&A system that allows users to upload any PDF and instantly ask questions in natural language. Powered by LangChain, FAISS, and OpenAI’s GPT-4, and wrapped in a clean Streamlit interface, the assistant was my first submission for the Agentic AI Developer Certification Program by Ready Tensor.
What follows is the story of how I built it from idea to deployment.
Large Language Models (LLMs) like GPT-4 are undeniably powerful. They can reason, summarize, and answer questions with remarkable fluency. But they suffer from one major drawback: they don’t “know” your documents unless those documents are explicitly included in the prompt or training data. That’s a problem if you're trying to query a niche report, a company manual, or a recent academic article.
This is where Retrieval-Augmented Generation (RAG) comes in. Instead of relying solely on a pre-trained model, RAG enhances LLMs by retrieving relevant text snippets from external sources — like your uploaded PDF — and appending them as context before the LLM generates a response. It’s like giving GPT-4 glasses to actually read your document before answering.
The goal of my project was to make this power accessible to everyone — no coding skills required.
I started by mapping the user journey: someone uploads a PDF, types a question, and gets an answer with citations. Simple in theory, but behind the scenes, there were several moving parts to orchestrate. I chose tools that would let me build fast, iterate faster, and scale gracefully: Streamlit for the interface, PyMuPDF for text extraction, LangChain for orchestration, FAISS for vector search, and OpenAI’s GPT-4 for generation.
Once a user uploads a PDF, the assistant begins by extracting the raw text using PyMuPDF. That text is then broken into coherent chunks using LangChain’s `RecursiveCharacterTextSplitter`, which ensures semantic meaning is preserved across each segment.
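The chunking step can be sketched in plain Python. This is a simplified stand-in for `RecursiveCharacterTextSplitter` (the real splitter additionally recurses through separators like paragraphs and sentences before falling back to character counts); the sizes here are illustrative, not the app’s exact configuration:

```python
# Simplified illustration of overlapping chunking. Assumes overlap < chunk_size.
def split_text(text: str, chunk_size: int = 1000, overlap: int = 150) -> list[str]:
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        chunks.append(text[start:end])
        if end == len(text):
            break
        # Step back by `overlap` so context is preserved across boundaries.
        start = end - overlap
    return chunks
```

The overlap is what keeps a sentence that straddles a boundary fully visible in at least one chunk.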
Each chunk is passed through OpenAI’s Embedding API and transformed into a vector — a numerical representation of its semantic meaning. These vectors are stored in a FAISS index, enabling lightning-fast similarity search.
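Conceptually, the FAISS step reduces to nearest-neighbour search over those vectors. Here is a toy sketch using hand-made 2-D vectors in place of real OpenAI embeddings; FAISS does the same ranking at scale with optimized index structures:

```python
import math

# Toy stand-in for the vector index: rank stored chunk vectors by
# cosine similarity to a query vector. In the real app, vectors come
# from OpenAI's Embedding API and FAISS performs the search.
def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec: list[float], chunk_vecs: list[list[float]], k: int = 2) -> list[int]:
    ranked = sorted(range(len(chunk_vecs)),
                    key=lambda i: cosine(query_vec, chunk_vecs[i]),
                    reverse=True)
    return ranked[:k]
```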
When a user submits a question, the system performs a vector similarity search to identify the most relevant chunks from the document. These retrieved segments, along with the question, are then passed into a LangChain QA chain, which wraps GPT-4 with a prompt that encourages grounded, referenceable answers.
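Under the hood, the QA chain’s main job is to stitch the retrieved chunks and the question into a single grounded prompt. A hedged sketch of that assembly (the exact prompt template LangChain uses differs, and `build_prompt` is an illustrative name):

```python
# Assemble retrieved chunks plus the user's question into one grounded
# prompt, with numbered snippets so the answer can cite its sources.
def build_prompt(question: str, chunks: list[str]) -> str:
    context = "\n\n".join(f"[{i + 1}] {chunk}" for i, chunk in enumerate(chunks))
    return (
        "Answer the question using ONLY the context below, and cite the "
        "numbered snippets you used.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```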
Finally, the result is displayed in the Streamlit interface — complete with citations pointing to the exact document snippets used for the answer.
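The citation display itself is a plain formatting step before Streamlit renders it (via widgets such as `st.write` and `st.expander`); `render_answer` is an illustrative name, not the app’s actual function:

```python
# Format the model's answer followed by the numbered source snippets
# that were retrieved for it. Streamlit then renders the result.
def render_answer(answer: str, sources: list[str]) -> str:
    lines = [answer, "", "Sources:"]
    lines += [f"[{i + 1}] {snippet}" for i, snippet in enumerate(sources)]
    return "\n".join(lines)
```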
To test the assistant, I uploaded the foundational NLP paper “Attention Is All You Need”. I asked it questions like:
- “What is the purpose of positional encoding in transformers?”
- “How does the model handle long-range dependencies?”
The assistant answered clearly, referencing the appropriate section of the paper. That was the moment I realized the power of this tool.
Beyond academic papers, I envision the same workflow serving other real-world scenarios, such as querying company manuals, policy documents, and business reports.
One of the early challenges I faced was tuning the chunk size and overlap. Small chunks lacked context; large chunks diluted relevance. After experimentation, I found a balanced configuration that preserved semantic integrity while optimizing retrieval quality.
Another key lesson was handling out-of-scope questions. Initially, GPT-4 would hallucinate confidently if no relevant content was found. I addressed this by refining the system prompt to instruct the model to admit uncertainty if no answer could be found — drastically improving trust.
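The fix was a prompt-level change. Here is an illustrative version of that kind of system instruction (not the exact wording used in the app):

```python
# Illustrative system prompt that curbs hallucination by giving the
# model an explicit "don't know" escape hatch.
SYSTEM_PROMPT = (
    "You are a document Q&A assistant. Answer ONLY from the provided "
    "context. If the context does not contain the answer, reply: "
    "\"I couldn't find that in the document.\" Never guess or invent details."
)
```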
I also had to ensure API security. During development, `.env` files managed local secrets, while in production I used Streamlit’s secrets manager, keeping credentials hidden and the app secure.
Unlike generic AI chat tools, this assistant is context-aware and document-grounded. It doesn’t just generate an answer — it shows exactly where the answer came from. This is critical for knowledge workers, researchers, and students who demand transparency and trust.
It’s also accessible. There’s no setup or configuration required — just upload your PDF and start asking questions.
The assistant performed exceptionally well on technical, academic, and business PDFs. It returned coherent, relevant answers to in-scope questions and gracefully handled queries that weren’t covered in the text.
This project served as my capstone for Module 1 of the Agentic AI Developer Certification Program, and it reinforced my belief that LLMs become exponentially more useful when they are grounded in user-specific content.
You can explore the full source code and architecture on GitHub:
👉 https://github.com/Chizzy0428/RAG-Assistant-Project/tree/main
I’m currently exploring further enhancements to the assistant. This is just the beginning.
Nwabuisi Chizaram is a Data Scientist, Educator, and AI Developer passionate about building tools that make intelligence accessible. His work focuses on NLP, agentic systems, and real-world AI-powered applications for education, productivity, and research.
API keys are stored securely using `.env` in development and Streamlit Secrets in production. No secrets are ever exposed in the codebase.
“We’re not just building apps that answer questions — we’re building assistants that understand your world.”