
Figure 1: Heritage's mask icon.

Illustration: Heritage live feed.
Heritage is an agentic Retrieval-Augmented Generation (RAG) system designed to deliver precise, context-aware answers about Nigerian peoples, their pre-historic arts, and cultural practices. Built with LangGraph, Chroma, and Tavily, it integrates a vectorized knowledge base with real-time web search capabilities. The system leverages state-of-the-art (SOTA) models with tool-calling abilities, accessed via OpenRouter for cost efficiency, though it is not a free service. It features a modular architecture for domain adaptability and a Streamlit interface for user interaction. Evaluations show an 80% accuracy rate on heritage-related queries, with web search latency (P99 at 52.21s) identified as an area for optimization. Developed for the Agentic AI Developer Certification Program 2025, Heritage promotes cultural appreciation through AI-driven exploration.
Nigeria’s cultural heritage, encompassing ancient art forms like Nok, Igbo-Ukwu, Ife, and Benin, as well as diverse ethnic traditions, is a rich yet underexplored domain in digital education. Heritage addresses this by providing an AI-powered query system that enables users to explore Nigerian arts and culture through natural language interactions. Developed by 🅱🅻🅰🆀, this project leverages Retrieval-Augmented Generation (RAG) to combine a curated knowledge base with real-time web data, ensuring historical depth and contemporary relevance. The system uses state-of-the-art (SOTA) models with tool-calling capabilities, accessed cost-effectively via OpenRouter, though it is not free to operate due to API usage costs. It is deployed at Heritage on Streamlit Cloud and open-sourced under the MIT License at Github repo.
The objectives are:
This publication details the system’s methodology, experimental setup, results, and future improvements, fulfilling the Week 1 deliverables of the Agentic AI Developer Certification Program.
Heritage employs a modular Retrieval-Augmented Generation (RAG) architecture, leveraging LangGraph for stateful conversation management and tool orchestration. The system integrates a Chroma vector store, populated with PDF documents from the DATA_PATH directory, for historical and cultural data retrieval. An ensemble retriever, combining a similarity-based retriever (k=5) and an SVM retriever with weights [0.7, 0.3], ensures robust document retrieval. For queries requiring current information, the Tavily search tool fetches real-time web data with advanced search depth. The system uses OpenRouter to access cost-efficient SOTA models (meta-llama/llama-4-scout for primary responses and openai/gpt-4.1-nano for conversation summarization). The Streamlit interface enhances user engagement with a culturally themed UI, featuring dynamic fun facts, a query history table, and interactive charts for query metrics.
The pipeline comprises:
from langchain_chroma import Chroma
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_community.retrievers import SVMRetriever
from langchain.retrievers import EnsembleRetriever
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")
loader = DirectoryLoader(data_path, glob="**/*.pdf", loader_cls=PyMuPDFLoader)
docs = loader.load()
vector_store = Chroma.from_documents(docs, embeddings)
similarity_retriever = vector_store.as_retriever(search_kwargs={"k": 5})
svm_retriever = SVMRetriever.from_documents(docs, embeddings)
ensemble_retriever = EnsembleRetriever(retrievers=[similarity_retriever, svm_retriever], weights=[0.7, 0.3])

Figure 2: Heritage's Streamlit interface, showcasing the Statistics page.
The knowledge base consists of PDFs on Nigerian arts and culture in ./data/, covering Nok, Igbo-Ukwu, Ife, Benin, and Esie art forms, cultural practices, and historical figures. Due to licensing, the dataset is not publicly shared but can be extended with compliant sources.
The system was tested with heritage_eval_dataset.csv, containing five queries:
Outputs were compared against expected answers using an LLM-based evaluator openai/gpt-4.1-nano, with results saved to heritage_eval_results.csv.
meta-llama/llama-4-scout for generation and openai/gpt-4.1-nano for evaluation, accessed via OpenRouter for cost efficiency. Note that Heritage is not free to operate due to these API costs.Latency was monitored over the last 7 days, capturing P90 (0.229s) and P99 (52.21s) latencies. High-latency outliers were traced to Tavily web searches, with one run peaking at 14.21s.

Figure 3: Langsmith dashboard, showing P99 latency at 52.21s due to Tavily web searches.
True.True.True.False.True.4/5 (80%). The failure on Test 4 indicates gaps in the knowledge base or retrieval logic for historical dating.

Table 1: Summary of evaluation results.
Monitoring revealed a P90 latency of 0.229s and a P99 latency of 52.21s over the last 7 days. High-latency outliers, such as a 14.21s run, were attributed to Tavily web searches, likely due to network delays or API response times. This suggests that while most queries are processed efficiently, web-dependent queries can significantly impact performance.
Heritage effectively provides an AI-driven platform for exploring Nigerian arts and culture, achieving 80% accuracy in evaluations. Its modular RAG architecture, combining Chroma-based retrieval with Tavily web search, ensures flexibility and relevance. However, challenges remain:
Future improvements include:
Heritage represents a meaningful step toward preserving Nigeria’s cultural legacy through AI, with potential for adaptation to other domains.
The project is accessible at Heritage on Streamlit and its source code at Github repo.
