This documentation provides a comprehensive overview of the RAG-Powered AI Assistant repository. The project demonstrates how to build a Retrieval-Augmented Generation (RAG) assistant capable of answering questions using a custom knowledge base of publications. The assistant supports both command-line and web user interfaces, leveraging Mistral and Hugging Face models for retrieval and generation.
The primary goal of this repository is to showcase the development of an AI assistant that combines retrieval-based search with generative AI to provide contextually relevant answers from a curated set of publication data. This approach addresses the challenge of extracting actionable insights from large, domain-specific datasets, making it well suited to research, academic, or organizational knowledge bases.
The RAG-Powered-AI-Assistant library is architected around modular components for scalable, efficient retrieval-augmented generation (RAG). Its design centers on two foundational classes—TextChunker and VectorDB—which together enable robust document preprocessing, vector storage, and semantic search. Here’s a detailed breakdown of how the library works, referencing the system components documentation and the overall pipeline.
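At a high level, the pipeline retrieves the chunks most relevant to a question and feeds them to a generative model. The following is a dependency-free sketch of that flow; `retrieve` and `generate` are illustrative stand-ins (the real assistant retrieves via VectorDB with Hugging Face embeddings and generates with a Mistral model), and the toy word-overlap scoring is not the library's actual ranking.

```python
# Sketch of the retrieve-then-generate flow. `retrieve` and `generate`
# are stand-ins: the real assistant uses VectorDB (ChromaDB + Hugging
# Face embeddings) for retrieval and a Mistral model for generation.

def retrieve(question: str, corpus: list[str], top_k: int = 2) -> list[str]:
    """Rank corpus passages by naive word overlap with the question."""
    q_words = set(question.lower().split())
    scored = sorted(corpus,
                    key=lambda p: len(q_words & set(p.lower().split())),
                    reverse=True)
    return scored[:top_k]

def generate(question: str, context: list[str]) -> str:
    """Stand-in for the LLM call: assemble the augmented prompt."""
    joined = "\n".join(f"- {passage}" for passage in context)
    return f"Context:\n{joined}\n\nQuestion: {question}\nAnswer:"

corpus = [
    "SMOTE oversamples the minority class to handle class imbalance.",
    "Transformers use attention to model long-range dependencies.",
]
question = "How to handle class imbalance?"
context = retrieve(question, corpus)
prompt = generate(question, context)
print(prompt)
```

The assistant's answer quality depends on this prompt assembly: the generator only sees the retrieved context, so retrieval quality bounds answer quality.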
The TextChunker class is responsible for transforming raw publication documents into smaller, manageable chunks suitable for embedding and retrieval.
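A minimal sketch of such a chunker follows. The class name `SimpleTextChunker` and the parameters `chunk_size` and `separators` are illustrative assumptions, not the library's actual API; the real `TextChunker` may expose different names and options.

```python
# Minimal sketch of a TextChunker-style splitter (names and parameters
# are assumptions for illustration, not the library's actual API).

class SimpleTextChunker:
    def __init__(self, chunk_size: int = 200,
                 separators: tuple[str, ...] = ("\n\n", ". ")):
        self.chunk_size = chunk_size
        self.separators = separators

    def chunk_text(self, text: str) -> list[str]:
        # Split on each separator in turn, then greedily pack the pieces
        # into chunks of at most chunk_size characters.
        parts = [text]
        for sep in self.separators:
            parts = [p for part in parts for p in part.split(sep) if p]
        chunks, current = [], ""
        for part in parts:
            if current and len(current) + len(part) > self.chunk_size:
                chunks.append(current)
                current = ""
            current = (current + " " + part).strip()
        if current:
            chunks.append(current)
        return chunks

    def process_documents(self, documents: list[dict]) -> list[dict]:
        # Chunk every document and attach source metadata to each chunk.
        results = []
        for i, doc in enumerate(documents):
            for j, piece in enumerate(self.chunk_text(doc["content"])):
                results.append({"text": piece,
                                "metadata": {"title": doc["title"],
                                             "chunk_index": j}})
        return results

docs = [{"title": "Imbalance",
         "content": ("SMOTE oversamples. Class weights reweight the loss. "
                     "Undersampling drops majority examples.")}]
chunks = SimpleTextChunker(chunk_size=50).process_documents(docs)
```

Carrying the title in each chunk's metadata is what lets the assistant trace an answer back to its source publication.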
Key Implementation Details:

- Configurable Chunking: chunking behavior (such as the maximum chunk size) can be tuned to match the embedding model and the desired retrieval granularity.
- Flexible Separators: documents are split on a configurable set of separators, such as paragraph and sentence boundaries.
- Metadata Preservation: each chunk carries metadata from its source document, so retrieved chunks can be traced back to their origin.
- Batch Processing: the process_documents method handles a list of documents, chunking each and aggregating the results.
- Document Loading: the load_documents method loads documents from a directory of JSON files, extracting fields like title and publication_description.

The typical chunking workflow is:

1. Load Documents: call TextChunker.load_documents to read and parse the JSON files, extracting titles and content.
2. Chunk Documents: call TextChunker.process_documents to split each document into chunks, each with its own metadata.

The VectorDB class manages persistent storage and retrieval of text embeddings using ChromaDB and Hugging Face models.
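The contract VectorDB provides can be illustrated with a dependency-free toy. The real class persists embeddings in ChromaDB and embeds with Hugging Face models; here an in-memory store with a bag-of-words vector and cosine similarity stands in for both, and the method names `insert_documents` and `search` are assumptions, not the library's actual signatures.

```python
import math
from collections import Counter

# Toy stand-in for VectorDB: a bag-of-words "embedding" and cosine
# similarity illustrate the insert/search contract. The real class
# persists to ChromaDB and embeds with Hugging Face models.

def embed(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class ToyVectorDB:
    def __init__(self):
        self._records = []  # (embedding, text, metadata)

    def insert_documents(self, chunks: list[dict]) -> None:
        for chunk in chunks:
            self._records.append(
                (embed(chunk["text"]), chunk["text"], chunk["metadata"]))

    def search(self, query: str, top_k: int = 3) -> list[dict]:
        q = embed(query)
        ranked = sorted(self._records,
                        key=lambda r: cosine(q, r[0]), reverse=True)
        return [{"text": t, "metadata": m} for _, t, m in ranked[:top_k]]

db = ToyVectorDB()
db.insert_documents([
    {"text": "SMOTE handles class imbalance by oversampling.",
     "metadata": {"title": "Imbalance"}},
    {"text": "Attention lets transformers model long sequences.",
     "metadata": {"title": "Transformers"}},
])
hits = db.search("class imbalance techniques", top_k=1)
print(hits[0]["metadata"]["title"])  # → Imbalance
```

Swapping the toy `embed` for a real embedding model and the list for a ChromaDB collection recovers the production design without changing the calling code.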
Key Implementation Details:

- Persistent Storage: embeddings and documents are stored in a persistent ChromaDB collection, so the index survives restarts.
- Flexible Embeddings: embeddings are generated with Hugging Face models, which can be swapped to suit the domain.
- Document Insertion: chunks and their metadata are embedded and added to the collection.
- Semantic Search: queries are embedded and matched against stored chunks by vector similarity.

The typical workflow is:

1. Initialize Database: create a VectorDB instance backed by persistent storage.
2. Insert Documents: add the chunks produced by TextChunker, together with their metadata.
3. Semantic Search: query the database to retrieve the chunks most relevant to a question.
git clone https://github.com/HiIAmTzeKean/RAG-Powered-AI-Assistant
cd RAG-Powered-AI-Assistant
uv sync
# Create a .env file in the project root with your API keys:
MISTRAL_API_KEY=your_mistral_api_key
HF_KEY=your_huggingface_api_key
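Failing fast when a key is missing gives a clearer error than a failed API call later. A minimal sketch of such a startup check follows; the variable names match the .env file above, but the `load_api_keys` helper itself is an illustration, not part of the repository (and assumes the .env values have already been exported into the environment, e.g. via python-dotenv).

```python
import os

# Hypothetical startup check: read the two required keys and raise a
# descriptive error if either is absent from the environment.
def load_api_keys() -> dict:
    keys = {name: os.getenv(name) for name in ("MISTRAL_API_KEY", "HF_KEY")}
    missing = [name for name, value in keys.items() if not value]
    if missing:
        raise RuntimeError(
            f"Missing environment variables: {', '.join(missing)}")
    return keys
```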
Sample CLI Usage:
uv run cli "What are effective techniques for handling class imbalance?"
Sample Web UI Usage:
uv run ui
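The publications in /data are JSON files from which the loader extracts fields like title and publication_description (field names from the documentation above). A minimal record might look like the following; the concrete values, and any additional fields a real record may carry, are invented for illustration.

```python
import json

# Hypothetical publication record with the two fields the loader reads;
# the values here are illustrative, not real data from the repository.
record = {
    "title": "Handling Class Imbalance in Practice",
    "publication_description": "A survey of resampling and reweighting techniques.",
}
serialized = json.dumps(record, indent=2)
print(serialized)
```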
You can replace the sample publications in /data with your own JSON or text files, provided they follow the expected schema.

The RAG-Powered AI Assistant repository provides a practical, extensible template for building RAG-based assistants over custom datasets. Its modular design, support for multiple interfaces, and API flexibility make it suitable for a wide range of knowledge-driven applications.
For more details, see the repository's README and source code.