This project implements a RAG (Retrieval-Augmented Generation) based AI Assistant using Streamlit as the front-end interface. The assistant retrieves relevant documents from a local knowledge base and generates responses using embeddings and similarity search. The implementation consists of two main components:
rag_demo.py – handles document retrieval, embedding generation, and querying.
app.py – provides the user interface where users can enter questions and receive AI-generated answers.
Clone the repository:
git clone https://github.com/your-username/rag-assistant.git
cd rag-assistant

Install the dependencies:
pip install -r requirements.txt

Run the application:
streamlit run app.py
Open the Streamlit app in your browser (default: http://localhost:8501).
Enter a question in the text box.
Click Get Answer.
Traditional AI assistants rely only on pre-trained models and may hallucinate answers when external knowledge is required. To overcome this, Retrieval-Augmented Generation (RAG) integrates a retrieval step with generative models. Our project focuses on building a simple RAG pipeline with Streamlit to demonstrate how external knowledge can improve responses.
The main idea is:
Store documents in a vector database (embeddings).
Retrieve the top relevant documents for a given query.
Use those documents to generate the final answer.
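The three steps above can be sketched with a toy in-memory pipeline. This is an illustrative stand-in only: the bag-of-words "embedding" and the sample document list are placeholders, not the project's actual embedding model or vector database.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real pipeline would use a
    # learned sentence-embedding model instead.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], top_k: int = 2) -> list[str]:
    # Rank all stored documents by similarity to the query and
    # return the top_k most relevant ones.
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:top_k]

docs = [
    "RAG combines retrieval with generation.",
    "Streamlit builds data apps in Python.",
    "Embeddings map text to numerical vectors.",
]
# e.g. retrieve("what are embeddings", docs, top_k=1)
# → ["Embeddings map text to numerical vectors."]
```

The retrieved documents would then be injected into the generative model's prompt to produce the final answer.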
The project follows these steps:
make_embedder(): Creates an embedding model to convert text into numerical vectors.
get_collection(): Prepares a vector collection to store and manage document embeddings.
answer_research_question(query):
Retrieves relevant documents.
Passes them to the model to generate an answer.
Returns the final response.
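A minimal sketch of how these three functions could fit together. The function names match the report, but the bodies are illustrative stand-ins (a word-count embedder and a plain list as the "collection") rather than the project's real model and vector store; the generation step is represented by returning the retrieved context.

```python
import math
from collections import Counter

def make_embedder():
    # Stand-in embedder: bag-of-words counts. The real project would
    # return a sentence-embedding model here.
    return lambda text: Counter(text.lower().split())

def get_collection():
    # Stand-in vector collection: a list of (text, vector) pairs
    # instead of a real vector database collection.
    return []

def _cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def answer_research_question(query, collection, embed, top_k=3):
    # 1. Retrieve the documents most similar to the query.
    qv = embed(query)
    ranked = sorted(collection, key=lambda item: _cosine(qv, item[1]),
                    reverse=True)
    context = [text for text, _ in ranked[:top_k]]
    # 2. A real implementation would pass `context` to a generative
    #    model; here the retrieved context stands in for the answer.
    return {"query": query, "context": context}

embedder = make_embedder()
collection = get_collection()
for doc in ["RAG pairs retrieval with generation.",
            "Streamlit renders the UI."]:
    collection.append((doc, embedder(doc)))
```

In the sketch the collection is populated by appending `(text, vector)` pairs; a real vector database would handle storage and indexing internally.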
The app provides a clean UI with the title "RAG-based AI Assistant".
Users can input a question in a text box.
On clicking "Get Answer", the assistant retrieves and generates a response using the RAG pipeline.
A checkbox option allows users to view the top retrieved documents that contributed to the answer.
The current implementation is single-turn retrieval: each query is processed independently.
Testing the retrieval process followed these steps:
Tested queries on the assistant by providing custom questions.
Verified that retrieved documents matched the context of the question.
Checked that answers improved when retrieved documents were relevant.
Compared answers with and without retrieved documents to validate the advantage of retrieval.
To validate the assistant's performance, we tested queries against known documents. Setting top_k to 5 balanced relevance and efficiency. To handle large documents, the text is split into smaller chunks of 500 tokens with an overlap of 100 tokens.
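The chunking scheme described above (500-token chunks, 100-token overlap) can be sketched as a sliding window. Tokenization here is a simple whitespace split for illustration; the real pipeline may use a model-specific tokenizer.

```python
def chunk_tokens(tokens, chunk_size=500, overlap=100):
    # Slide a window of `chunk_size` tokens, stepping by
    # `chunk_size - overlap` so consecutive chunks share
    # `overlap` tokens of context.
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break
    return chunks

tokens = "some long document ...".split()  # whitespace "tokens"
chunks = chunk_tokens(tokens)
```

For a 1200-token document this yields chunks starting at positions 0, 400, and 800, with the last 100 tokens of each chunk repeated at the start of the next so sentences spanning a boundary are not lost.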
The RAG assistant successfully retrieved relevant supporting documents for user queries.
The generated answers were more accurate and context-aware compared to using only a generative model.
The Streamlit interface allowed real-time interaction, making it easy to test the RAG pipeline.
The developed application demonstrates how RAG enhances AI assistants by integrating external knowledge retrieval with generation. Our implementation shows that:
A simple embedding-based retriever improves response quality.
The Streamlit interface provides an interactive and user-friendly way to explore RAG.
Even with a minimal setup (rag_demo.py + app.py), a functional RAG system can be built and extended further.
While the current assistant demonstrates the core of RAG, several improvements are planned.
Keywords: RAG, Retrieval-Augmented Generation, Vector Database, AI Assistant, Streamlit, Embeddings, Conversational AI, NLP