ABSTRACT
This project presents a Custom Retrieval-Augmented Generation (RAG) Assistant with a graphical interface that allows users to interact with their own text documents.
The system integrates local LLMs served through Ollama (e.g., Llama 2, Mistral) with Hugging Face embeddings (sentence-transformers/all-MiniLM-L6-v2) and a FAISS index for efficient semantic search.
All data and processing run locally, ensuring privacy and offline capability.
Users simply select a folder of .txt files, initialize the system, and ask natural-language questions about their content.
METHODOLOGY
The assistant follows a classic RAG pipeline with three main components (document loading, retrieval, and generation), plus a graphical user interface:
Document Loading & Preprocessing
Local .txt files are read with a robust text loader, split into 1000-character chunks with a 200-character overlap, and encoded using the sentence-transformers/all-MiniLM-L6-v2 embedding model.
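As a rough illustration, the loading and splitting step might look like the sketch below (import paths assume a recent LangChain release and may differ in older versions; the folder path comes from the GUI's folder selection):

    from langchain_community.document_loaders import DirectoryLoader, TextLoader
    from langchain_text_splitters import RecursiveCharacterTextSplitter

    def load_and_split(folder):
        # Read every .txt file in the user-selected folder.
        loader = DirectoryLoader(
            folder, glob="**/*.txt",
            loader_cls=TextLoader, loader_kwargs={"encoding": "utf-8"},
        )
        docs = loader.load()
        # Split into 1000-character chunks with 200 characters of overlap.
        splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
        return splitter.split_documents(docs)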
Retrieval Module
Embeddings are stored in a FAISS vector database for fast similarity search. The retriever returns the top-k most relevant chunks (k = 4 by default).
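A minimal sketch of building the index and retriever, assuming the chunks produced by the loading step above (the embedding and vector-store classes are shown with their langchain-community import paths, which also vary by version):

    from langchain_community.embeddings import HuggingFaceEmbeddings
    from langchain_community.vectorstores import FAISS

    # Encode chunks with all-MiniLM-L6-v2 and index them in FAISS.
    embeddings = HuggingFaceEmbeddings(
        model_name="sentence-transformers/all-MiniLM-L6-v2"
    )
    chunks = load_and_split("data/")  # placeholder folder path
    vectorstore = FAISS.from_documents(chunks, embeddings)
    # Return the 4 most similar chunks per query (the default k used here).
    retriever = vectorstore.as_retriever(search_kwargs={"k": 4})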
Generation Module
The retrieved context and the user's query are passed to an Ollama model (e.g., llama2, mistral, phi) through LangChain's RetrievalQA chain for grounded text generation.
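The generation step can be sketched as follows; it assumes a running Ollama server with the chosen model already pulled (e.g., via ollama pull llama2) and the retriever built above:

    from langchain_community.llms import Ollama
    from langchain.chains import RetrievalQA

    llm = Ollama(model="llama2")  # swap in "mistral" or "phi" as desired
    qa_chain = RetrievalQA.from_chain_type(
        llm=llm, retriever=retriever, return_source_documents=True,
    )
    result = qa_chain.invoke({"query": "Summarize the main points in these documents."})
    print(result["result"])

Switching models only requires changing the model name, provided the model has been pulled into the local Ollama installation.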
User Interface
A simple Tkinter GUI allows users to select datasets, initialize the RAG system, and query the assistant interactively.
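A simplified, hypothetical layout for such a GUI is sketched below; build_rag_chain is a placeholder for the loading, indexing, and chain-construction steps shown earlier, not a function from the actual project:

    import tkinter as tk
    from tkinter import filedialog, scrolledtext

    qa_chain = None  # set once the user initializes the RAG system

    def build_rag_chain(folder):
        # Hypothetical helper: would combine the loading, FAISS, and
        # RetrievalQA sketches above into one chain over the chosen folder.
        raise NotImplementedError("wire in the RAG pipeline sketched earlier")

    def select_folder():
        folder = filedialog.askdirectory()
        if folder:
            folder_var.set(folder)

    def initialize():
        global qa_chain
        qa_chain = build_rag_chain(folder_var.get())

    def ask():
        if qa_chain is None:
            output_box.insert(tk.END, "Initialize the RAG system first.\n")
            return
        question = query_entry.get()
        result = qa_chain.invoke({"query": question})
        output_box.insert(tk.END, f"Q: {question}\nA: {result['result']}\n\n")

    root = tk.Tk()
    root.title("Custom RAG Assistant")
    folder_var = tk.StringVar()
    tk.Button(root, text="Select dataset folder", command=select_folder).pack()
    tk.Button(root, text="Initialize RAG system", command=initialize).pack()
    query_entry = tk.Entry(root, width=60)
    query_entry.pack()
    tk.Button(root, text="Ask", command=ask).pack()
    output_box = scrolledtext.ScrolledText(root, width=80, height=20)
    output_box.pack()
    root.mainloop()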
Models: Ollama LLMs (e.g., llama2, mistral, phi), Hugging Face embeddings (sentence-transformers/all-MiniLM-L6-v2)
RESULTS
Example Output:
User: “Summarize the main points in these documents.”
Assistant: “The texts discuss AI fundamentals, transformer architectures, and retrieval-augmented generation methods using local models.”
CONCLUSION
The system delivers accurate, grounded responses over local text datasets with low latency, demonstrating how LangChain, FAISS, Hugging Face embeddings, and Ollama together enable practical offline RAG workflows.