The project “RAG Assistant – Intelligent Knowledge Assistant” was developed as the final assignment for the Ready Tensor course.
Its goal was to build an application capable of answering user questions based on provided text or PDF documents using the Retrieval-Augmented Generation (RAG) approach.
The application combines two powerful technologies: semantic search (to retrieve the most relevant information) and generative artificial intelligence (to produce natural language answers).
By integrating these two layers, the system can generate responses exclusively from the supplied documents, without relying on external knowledge.
The project was designed to be modular, extensible, and provider-agnostic, supporting multiple Large Language Model (LLM) APIs, including OpenAI (GPT-4o-mini), Groq (LLaMA), and Google Gemini.
Before developing the RAG Assistant, most conversational AI tools operated solely on pre-trained model knowledge.
While they could generate fluent and coherent responses, these systems frequently faced issues related to accuracy, traceability, and context alignment, particularly when users required answers grounded in private or domain-specific documents.
Existing solutions lacked:
A straightforward mechanism to link external user-provided documents (PDF or text) to the reasoning process of large language models,
Transparent and verifiable answers showing which document sources were used,
Cross-platform flexibility to switch between AI providers such as OpenAI, Groq, or Google Gemini without major code changes.
The RAG Assistant was designed to bridge this gap by introducing a modular, transparent, and controllable RAG pipeline.
It retrieves relevant content from user documents, uses that context to generate precise answers, and ensures all responses are fully grounded in verifiable data rather than generic model knowledge.
The dataset used in this project is user-generated, meaning that the system does not rely on predefined or external corpora.
Instead, the user provides their own collection of documents in the data/ directory, which can include:
Plain text files (.txt)
PDF documents (.pdf)
Each document is automatically processed and converted into a structured format containing:
Content – the full extracted text of the document
Metadata – such as the document title (from filename) and file type
This flexible design allows the assistant to adapt to a wide variety of use cases, for example:
Internal company knowledge bases
Research papers or academic notes
Product manuals or technical documentation
Personal notes or educational resources
All document processing happens locally, ensuring data privacy and user control over the knowledge base.
The dataset can be updated dynamically — new files can be added to the data/ folder without modifying the codebase.
The application was implemented in Python and organized into several key components:
Users can place their .txt or .pdf files in the data/ folder.
The program automatically reads these files, extracts their content, and structures them into a standardized format containing text and metadata (title and file type).
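The loading step described above can be sketched roughly as follows. This is an illustrative sketch, not the project's actual code: the function name is hypothetical, and PDF extraction is assumed to use the third-party pypdf library.

```python
from pathlib import Path


def load_documents(data_dir: str = "data") -> list[dict]:
    """Read every .txt and .pdf file in data_dir into a uniform structure:
    the full text under "content" and the title/file type under "metadata"."""
    documents = []
    for path in sorted(Path(data_dir).iterdir()):
        if path.suffix == ".txt":
            content = path.read_text(encoding="utf-8")
        elif path.suffix == ".pdf":
            # pypdf is imported lazily so plain-text loading still works
            # without the optional PDF dependency (an assumption here).
            from pypdf import PdfReader
            reader = PdfReader(str(path))
            content = "\n".join(page.extract_text() or "" for page in reader.pages)
        else:
            continue  # ignore unsupported file types
        documents.append({
            "content": content,
            "metadata": {"title": path.stem, "file_type": path.suffix.lstrip(".")},
        })
    return documents
```

The title is derived from the filename (without extension), matching the metadata description above.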
Each document is converted into semantic vectors — numerical representations of meaning.
This enables semantic search, allowing the assistant to find relevant passages even when the user’s query uses different wording than the original text.
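The retrieval idea can be illustrated with a minimal sketch. The `embed` function below is a deliberately toy stand-in (a bag-of-characters vector); a real system would call an embedding model or API. Only the interface — embed the query and each chunk, then rank by cosine similarity — reflects how semantic search works.

```python
import math


def embed(text: str) -> list[float]:
    """Toy stand-in embedding: counts letters a-z. A real pipeline would
    use a semantic embedding model; this only illustrates the interface."""
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 for zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: str, chunks: list[str], k: int = 3) -> list[str]:
    """Return the k chunks most similar to the query embedding."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine_similarity(embed(c), q), reverse=True)
    return ranked[:k]
```

Because the comparison happens in vector space rather than by keyword matching, a chunk can rank highly even when it shares no exact wording with the query — provided the embedding model captures the shared meaning.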
The system automatically selects an available model based on API keys stored in the .env file.
It supports three main AI providers:
OpenAI GPT-4o-mini
Groq LLaMA 3.1
Google Gemini 2.0
These models generate human-like answers grounded in the retrieved document fragments.
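The key-based provider selection could look roughly like this. The environment-variable names and the fallback order are assumptions for illustration; the project's actual configuration may differ.

```python
import os


def select_provider() -> str:
    """Return the first provider whose API key is set in the environment
    (typically loaded from a .env file). Names here are illustrative."""
    candidates = [
        ("openai", "OPENAI_API_KEY"),   # GPT-4o-mini
        ("groq", "GROQ_API_KEY"),       # LLaMA 3.1
        ("gemini", "GOOGLE_API_KEY"),   # Gemini 2.0
    ]
    for provider, env_var in candidates:
        if os.environ.get(env_var):
            return provider
    raise RuntimeError("No API key found; set a provider key in the .env file")
```

In practice a library such as python-dotenv would load the .env file into the environment before this check runs.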
When the user asks a question, the assistant performs two main steps:
Retrieval – searches the vector database for the most relevant document chunks.
Generation – passes those chunks to the selected LLM, which creates a final answer.
The assistant is explicitly instructed not to answer questions unrelated to the available documents, ensuring reliability and factual accuracy.
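The two steps above — and the grounding instruction — come together when the prompt is assembled. The sketch below shows one common way to do this; the exact wording of the system message is an assumption, not the project's actual prompt.

```python
def build_prompt(question: str, chunks: list[str]) -> list[dict]:
    """Assemble a chat-style prompt that restricts the model to the
    retrieved context. Chunks are numbered so answers can cite sources."""
    context = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(chunks))
    system = (
        "Answer only from the context below. If the answer is not in the "
        "context, say that the documents do not cover the question.\n\n"
        f"Context:\n{context}"
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]
```

Placing the retrieved chunks in the system message, together with an explicit refusal instruction, is what keeps the model from falling back on its pre-trained knowledge.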
Users interact with the assistant through a simple console interface.
They can type questions in natural language and receive clear, structured responses in Markdown format, often with bullet points for readability.
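A console loop of this kind can be sketched in a few lines; the function name and prompt text are illustrative, and `answer_fn` stands in for the full retrieval-plus-generation pipeline.

```python
def run_console(answer_fn) -> None:
    """Minimal console loop: answer_fn maps a question string to a
    Markdown-formatted answer (a stand-in for the RAG pipeline)."""
    print("RAG Assistant - type 'exit' to quit")
    while True:
        question = input("> ").strip()
        if question.lower() in {"exit", "quit"}:
            break
        if question:
            print(answer_fn(question))
```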
The developed project enables anyone to create a personal knowledge assistant grounded in their own text data.
Testing confirmed that the assistant can:
Retrieve the most relevant information from various text sources,
Generate concise and accurate summaries,
Maintain strict alignment with the original content.
Because of its modular design, the system can be easily extended — for example, by adding new data sources, embedding methods, or integrating additional LLM providers.
In summary, the RAG Assistant demonstrates a practical and transparent implementation of the Retrieval-Augmented Generation approach.
It shows how modern AI technologies can be used to safely and effectively interact with private knowledge bases, minimizing hallucinations and ensuring that every answer is grounded in actual documents.