Project type: Agentic AI / Retrieval‑Augmented Generation
Domain: Technical documentation QA (Blender 3D)
Repository: https://github.com/mor-dar/blender-rag-assistant
License: CC‑BY‑SA 4.0 (project code and incorporated manual excerpts)
Blender Bot is a retrieval‑augmented assistant for the Blender 3D Manual that answers “how‑to” questions with grounded explanations and clickable citations back to the official docs. The system ingests and cleans the manual’s HTML, semantically chunks the content, and indexes it in ChromaDB using sentence‑transformer embeddings. At query time it performs top‑K similarity search and prompts an LLM (default Llama‑3.1‑8B‑Instant via Groq, with OpenAI as an option) to compose faithful answers strictly from the retrieved context. The assistant is available via a CLI and a Streamlit web UI, and can be launched with a Docker one‑liner without local setup. A small, reproducible mini‑evaluation reports retrieval recall, MRR, and latency on a demo subset of the manual (see the Evaluation section and script). The project preserves CC‑BY‑SA 4.0 attribution for the Blender Manual and surfaces sources in every answer.
Blender Bot is a retrieval‑augmented assistant that answers questions about Blender using only the official Blender Manual as its knowledge source. The system parses and cleans the manual’s HTML, semantically chunks the content, indexes it in ChromaDB, and serves grounded, cited answers via a CLI or a Streamlit web UI. It supports Groq (default: Llama‑3.1‑8B‑Instant) or OpenAI LLMs, with behavior configured through environment variables.
Why this matters. Tool and version‑specific “how‑to” queries often trigger hallucinations in generic chatbots. By constraining generation strictly to retrieved, cited passages from the Blender Manual—licensed under CC‑BY‑SA 4.0—the assistant improves precision while respecting attribution.
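To make the retrieve‑then‑generate flow concrete, the sketch below embeds a question, pulls the top‑K chunks from a persisted ChromaDB collection, and prompts the LLM with only that retrieved context. It is a minimal illustration under stated assumptions: the collection name, the per‑chunk url metadata field, and the function layout are hypothetical, not the repository’s actual modules.

```python
# Minimal retrieve-then-generate sketch (hypothetical names; not the repo's actual modules).
import chromadb
from sentence_transformers import SentenceTransformer
from groq import Groq

embedder = SentenceTransformer("multi-qa-MiniLM-L6-cos-v1")
chroma = chromadb.PersistentClient(path="./data/vector_db")
collection = chroma.get_collection("blender_docs")
llm = Groq()  # reads GROQ_API_KEY from the environment

def answer(question: str, k: int = 5) -> str:
    # 1) Embed the question and run top-K similarity search over the manual chunks.
    query_vec = embedder.encode(question).tolist()
    hits = collection.query(query_embeddings=[query_vec], n_results=k)
    docs = hits["documents"][0]
    urls = [meta.get("url", "") for meta in hits["metadatas"][0]]

    # 2) Build a numbered context block so the model can cite [1], [2], ...
    context = "\n\n".join(f"[{i + 1}] {doc}" for i, doc in enumerate(docs))
    prompt = (
        "Answer the question using ONLY the context below and cite sources as [n]. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

    # 3) Generate a grounded answer and append a Sources list with manual URLs.
    reply = llm.chat.completions.create(
        model="llama-3.1-8b-instant",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.0,
    )
    sources = "\n".join(f"[{i + 1}] {url}" for i, url in enumerate(urls))
    return reply.choices[0].message.content + "\n\nSources:\n" + sources
```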
Configuration is handled through environment variables (a .env file) covering model selection (Groq/OpenAI), embeddings, chunking, retrieval, caching, and optional conversation memory (window/summary).

Figure 1: Data preparation workflow
Implementation hints & structure appear in the repo’s README (e.g., “Document Processing,” “Vector Builder,” and semantic chunking notes).
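As a rough illustration of how this environment‑driven configuration is typically loaded at startup (the variable names follow the repo’s documented settings; the loader itself is a sketch, not the repository’s code):

```python
# Sketch of environment-driven configuration; variable names follow the documented
# settings, but this loader is illustrative rather than the repository's implementation.
import os
from dataclasses import dataclass

@dataclass(frozen=True)
class Settings:
    groq_model: str = os.getenv("GROQ_MODEL", "llama-3.1-8b-instant")
    openai_model: str = os.getenv("OPENAI_MODEL", "gpt-4")
    embedding_model: str = os.getenv("EMBEDDING_MODEL", "multi-qa-MiniLM-L6-cos-v1")
    chunk_size: int = int(os.getenv("CHUNK_SIZE", "512"))
    chunk_overlap: int = int(os.getenv("CHUNK_OVERLAP", "50"))
    use_semantic_chunking: bool = os.getenv("USE_SEMANTIC_CHUNKING", "true").lower() == "true"
    retrieval_k: int = int(os.getenv("RETRIEVAL_K", "5"))
    memory_type: str = os.getenv("MEMORY_TYPE", "none")  # none | window | summary

settings = Settings()
```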
Vector store: ChromaDB with on-disk persistence (CHROMA_PERSIST_DIRECTORY, CHROMA_COLLECTION_NAME).
Embeddings: multi-qa-MiniLM-L6-cos-v1 sentence-transformer embeddings, batchable on CPU/CUDA.
Chunking: configurable chunk size and overlap with optional semantic chunking (CHUNK_SIZE, CHUNK_OVERLAP, USE_SEMANTIC_CHUNKING, etc.).
Prompting: a context‑only instruction with numbered citations [1], [2], … and a clean “Sources” list with working Blender‑manual URLs.
Models: Groq (default Llama‑3.1‑8B‑Instant) or OpenAI (temperature and max tokens configurable for both through environment variables).
Interfaces: CLI via python main.py (or --cli); web UI via python main.py --web, then open http://localhost:8501.

Figure 2: Query handling workflow
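On the indexing side, the data preparation workflow amounts to: clean each manual page’s HTML, split it into overlapping chunks, embed the chunks, and store them with their source URL so citations can be reconstructed at answer time. The fixed-size splitter below is a stand-in for the repository’s semantic chunking, and all names are illustrative.

```python
# Illustrative indexing sketch: manual HTML -> cleaned text -> overlapping chunks -> ChromaDB.
# The naive splitter stands in for the repository's semantic chunking logic.
import chromadb
from bs4 import BeautifulSoup
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("multi-qa-MiniLM-L6-cos-v1")
chroma = chromadb.PersistentClient(path="./data/vector_db")
collection = chroma.get_or_create_collection("blender_docs")

def chunk_text(text: str, size: int = 512, overlap: int = 50) -> list[str]:
    # Overlapping word windows; a placeholder for semantic chunking.
    words = text.split()
    step = max(size - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, len(words), step)]

def index_page(html: str, url: str) -> None:
    # Strip markup/navigation and keep the readable text of one manual page.
    text = BeautifulSoup(html, "html.parser").get_text(" ", strip=True)
    chunks = chunk_text(text)
    collection.add(
        ids=[f"{url}#{i}" for i in range(len(chunks))],
        documents=chunks,
        embeddings=[embedder.encode(chunk).tolist() for chunk in chunks],
        metadatas=[{"url": url} for _ in chunks],
    )
```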
Evaluation setup: demo subset of the manual, multi-qa-MiniLM-L6-cos-v1 embeddings, k=5.

```bash
python scripts/evaluate_demo.py --autobuild-demo \
  --persist_dir ./data/vector_db --collection blender_docs --k 5
# Artifacts: outputs/eval_demo.md and outputs/eval_demo.json
```
| Metric | Value |
|---|---|
| Queries | 10 |
| k | 5 |
| Recall@1 | 1.0 |
| Recall@3 | 1.0 |
| MRR | 1.0 |
| Median latency (s) | 0.012 |
The full per-query table is available in outputs/eval_demo.md, with raw metrics in outputs/eval_demo.json.
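For reference, Recall@k and MRR over a labeled query set reduce to a few lines; the sketch below (with a toy example) assumes one gold document per query and may differ from the evaluation script’s exact implementation.

```python
# Recall@k and MRR, assuming one gold (relevant) document id per query.
def recall_at_k(ranked_ids: list[list[str]], gold_ids: list[str], k: int) -> float:
    hits = sum(gold in ranked[:k] for ranked, gold in zip(ranked_ids, gold_ids))
    return hits / len(gold_ids)

def mrr(ranked_ids: list[list[str]], gold_ids: list[str]) -> float:
    total = 0.0
    for ranked, gold in zip(ranked_ids, gold_ids):
        if gold in ranked:
            total += 1.0 / (ranked.index(gold) + 1)  # reciprocal rank of the gold hit
    return total / len(gold_ids)

# Toy example: gold document ranked 1st for query 1 and 2nd for query 2.
ranked = [["a", "b", "c"], ["x", "y", "z"]]
gold = ["a", "y"]
print(recall_at_k(ranked, gold, k=1))  # 0.5
print(mrr(ranked, gold))               # (1/1 + 1/2) / 2 = 0.75
```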
```bash
# Single line to run the web demo (includes data download & DB build)
docker run -p 8501:8501 -e GROQ_API_KEY=$GROQ_API_KEY \
  -v "$(pwd)/data:/app/data" mdar/blender-rag-assistant:v1.0.5 evaluate-web

# For the full manual, a two-step process is recommended:
# 1) Build the full knowledge base with the pre-built image - results persist to disk
docker run -v "$(pwd)/data:/app/data" mdar/blender-rag-assistant:v1.0.5 build-full

# 2) Run the web UI against the existing full DB
docker run -p 8501:8501 -v "$(pwd)/data:/app/data" mdar/blender-rag-assistant:v1.0.5 run-web
```
```bash
# 1) Install
pip install -r requirements.txt

# 2) Configure at least one provider
# Groq (default)
export GROQ_API_KEY=...
# or OpenAI
export OPENAI_API_KEY=...

# 3) Run CLI or Web
python main.py        # CLI (default)
python main.py --web  # Streamlit @ http://localhost:8501
```
Key environment variables (examples):
GROQ_MODEL=llama-3.1-8b-instant · OPENAI_MODEL=gpt-4 · EMBEDDING_MODEL=multi-qa-MiniLM-L6-cos-v1 · CHUNK_SIZE=512 · CHUNK_OVERLAP=50 · USE_SEMANTIC_CHUNKING=true · RETRIEVAL_K=5 · MEMORY_TYPE=none|window|summary. See .env.example in the repo for the full set.
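MEMORY_TYPE selects the optional conversation memory: window keeps only the last few turns, while summary compresses older turns. A minimal sketch of the window variant, assuming a simple turn buffer rather than the repository’s actual memory class:

```python
# Minimal window-memory sketch (illustrative; not the repository's memory implementation).
from collections import deque

class WindowMemory:
    def __init__(self, max_turns: int = 5):
        # Keep only the most recent turns; older ones fall off automatically.
        self.turns = deque(maxlen=max_turns)

    def add(self, user_msg: str, assistant_msg: str) -> None:
        self.turns.append((user_msg, assistant_msg))

    def as_context(self) -> str:
        # Rendered into the prompt ahead of the retrieved manual chunks.
        return "\n".join(f"User: {u}\nAssistant: {a}" for u, a in self.turns)
```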
```bash
# Build image
docker build -t blender-rag-assistant .

# Run CLI
docker run --rm -it \
  -e GROQ_API_KEY=$GROQ_API_KEY \
  blender-rag-assistant python main.py

# Run Web UI
docker run --rm -it -p 8501:8501 \
  -e GROQ_API_KEY=$GROQ_API_KEY \
  blender-rag-assistant python main.py --web
```
A Dockerfile and docker-entrypoint.sh are included in the repository.
(This section’s structure and emphasis follow Ready Tensor’s recommended publication practices and rubric.)
```bash
# Example: pick OpenAI instead of Groq
export OPENAI_API_KEY=...
export OPENAI_MODEL=gpt-4
python main.py --web

# Tune retrieval
export RETRIEVAL_K=8
export SIMILARITY_THRESHOLD=0.65
python main.py --cli
```
(Variables per the repo’s documented configuration matrix.)
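Assuming SIMILARITY_THRESHOLD filters retrieved chunks by score (an assumption about its semantics, not a confirmed detail of the repo), the filtering step would look roughly like this for a cosine-distance ChromaDB collection, where similarity = 1 − distance:

```python
# Hedged sketch: drop retrieved chunks whose similarity falls below the threshold.
# Assumes a cosine-distance ChromaDB collection (similarity = 1 - distance) and the
# result dict returned by collection.query(), which includes "distances" by default.
import os

SIMILARITY_THRESHOLD = float(os.getenv("SIMILARITY_THRESHOLD", "0.65"))

def filter_by_similarity(hits: dict, threshold: float = SIMILARITY_THRESHOLD):
    docs = hits["documents"][0]
    metas = hits["metadatas"][0]
    dists = hits["distances"][0]
    kept = []
    for doc, meta, dist in zip(docs, metas, dists):
        similarity = 1.0 - dist
        if similarity >= threshold:
            kept.append((doc, meta, similarity))
    return kept
```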