Despite the public availability of Ethiopia’s historical records and academic resources, their complexity and academic formality make them difficult for students and the general public to understand and apply in real-life learning. This project presents a conversational Ethiopian History RAG (Retrieval-Augmented Generation) Assistant that simplifies and contextualizes Ethiopian history through semantic search, reference-cited reasoning, and user-friendly interaction. Powered by LangChain, ChromaDB, HuggingFace embeddings, Groq-hosted LLaMA 3, and Streamlit, the assistant explains historical events in plain language, interprets user questions, and offers grounded insights—bridging the gap between historical text and practical understanding.
Understanding Ethiopia’s rich and complex history requires more than a simple search query. Users often seek context—why events happened, their cultural impact, and their global significance. Traditional databases return raw text but fail to interpret meaning, while generic AI models risk hallucination, misrepresenting historical facts and eroding trust in educational settings.
This project bridges that gap with a context-sensitive history assistant powered by Retrieval-Augmented Generation (RAG). By combining semantic search, structured reasoning, and robust prompt engineering, it provides grounded, reference-cited responses to user queries. Whether you’re exploring the origins of the Ethiopian Empire, understanding the impact of coffee on global trade, or analyzing the Derg regime, this assistant delivers accurate, explainable insights—transforming passive access into active learning.
While Ethiopia’s historical documents and Wikipedia entries are public and searchable, applying them meaningfully to specific real-world questions remains complex. For example, understanding the causes and consequences of the Battle of Adwa or the significance of the Derg regime requires cross-referencing multiple sources and synthesizing context. Few tools exist today that connect a user’s curiosity—about people, events, or cultural heritage—to a grounded, accessible historical interpretation.
Traditional databases provide raw text search but lack the ability to process full context or offer structured reasoning. Meanwhile, generic AI models may hallucinate answers or misrepresent historical facts, undermining user trust in educational applications. There’s a gap between historical literacy and historical applicability—especially when users seek not just what happened, but why it matters.
This project addresses that gap by delivering a context-sensitive history assistant that reasons from Ethiopia’s past to real-world understanding. Whether asked to explain the origins of the Ethiopian Empire or to assess the impact of coffee on global trade, the assistant provides grounded, reference-cited responses. In doing so, it transforms passive access into proactive learning—where history begins to answer back.
This project combines retrieval-augmented generation (RAG), structured prompt engineering, and vector-based search to deliver grounded historical reasoning through natural-language interaction. The assistant draws on an indexed corpus of curated Wikipedia articles, academic summaries, and primary sources, and is engineered to interpret user queries while maintaining historical accuracy and relevance.
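To make the idea of vector-based search concrete, here is a minimal sketch that uses only the Python standard library. It is not the project's implementation: word-count vectors stand in for HuggingFace embeddings, a plain dictionary stands in for ChromaDB, and the names `DOCS`, `embed`, `cosine`, and `search` are hypothetical, as are the three placeholder passages.

```python
import math
import re
from collections import Counter

# Hypothetical mini-corpus (placeholder text, not the project's indexed data).
DOCS = {
    "adwa": "The Battle of Adwa in 1896 saw Ethiopian forces defeat the Italian army.",
    "coffee": "Coffee originated in Ethiopia and spread through global trade routes.",
    "derg": "The Derg was a military junta that took power in Ethiopia in 1974.",
}


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (assumption: stands in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors; 0.0 if either is empty."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query: str, k: int = 2) -> list[tuple[str, float]]:
    """Return the k document ids most similar to the query, best first."""
    q = embed(query)
    scored = [(doc_id, cosine(q, embed(text))) for doc_id, text in DOCS.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


top = search("Who won the Battle of Adwa?", k=1)
print(top[0][0])
```

A real embedding model would also match paraphrases that share no words with the passage, which this word-count stand-in cannot do.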
The system is composed of four main layers:
The diagram below illustrates how the assistant is structured across four layers—data ingestion, retrieval, reasoning, and interaction:
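As a rough illustration of how the four layers could fit together, the sketch below wires them up as plain Python functions. Everything here is an assumption for illustration: the function names (`ingest`, `retrieve`, `reason`, `ask`), the character-based chunk sizes, and the word-overlap ranking are hypothetical, and the language model is passed in as a callable so the example runs without Groq or any other service.

```python
from typing import Callable


def ingest(text: str, source: str, size: int = 200, overlap: int = 40) -> list[dict]:
    """Layer 1 (data ingestion): split text into overlapping character chunks."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    if not text:
        return []
    step = size - overlap
    return [
        {"source": source, "text": text[start:start + size]}
        for start in range(0, max(len(text) - overlap, 1), step)
    ]


def retrieve(query: str, chunks: list[dict], k: int = 2) -> list[dict]:
    """Layer 2 (retrieval): rank chunks by shared words (stand-in for vector search)."""
    words = set(query.lower().split())

    def overlap_score(chunk: dict) -> int:
        return len(words & set(chunk["text"].lower().split()))

    return sorted(chunks, key=overlap_score, reverse=True)[:k]


def reason(query: str, context: list[dict], llm: Callable[[str], str]) -> str:
    """Layer 3 (reasoning): build a prompt from retrieved chunks and call the model."""
    context_block = "\n".join(f"[{c['source']}] {c['text']}" for c in context)
    prompt = f"Context:\n{context_block}\n\nQuestion: {query}"
    return llm(prompt)


def ask(query: str, chunks: list[dict], llm: Callable[[str], str]) -> str:
    """Layer 4 (interaction): single entry point a CLI or web UI could call."""
    return reason(query, retrieve(query, chunks), llm)


# Placeholder passages; the echo "model" simply returns the prompt it receives.
chunks = ingest("The Battle of Adwa took place in 1896.", source="adwa")
chunks += ingest("Coffee originated in Ethiopia.", source="coffee")
answer = ask("When was the Battle of Adwa?", chunks, llm=lambda prompt: prompt)
print(answer)
```

Passing the model in as a callable keeps the retrieval and prompt-assembly layers testable on their own, before a hosted model is attached.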
The Ethiopian History RAG Assistant was tested locally on foundational use cases, confirming its ability to reason accurately across key areas of Ethiopian history. While formal deployment and benchmarking are still in progress, early results indicate strong alignment between user input and relevant historical responses.
Core capabilities were validated:
The Ethiopian History RAG Assistant shows strong promise as a foundational tool for education and public understanding. Planned enhancements include:
| No. | Feature | Description |
|---|---|---|
| 1 | Amharic and Multilingual Support | Make the assistant accessible to non-English speakers through Amharic and regional languages. |
| 2 | Primary Source and Archive Integration | Incorporate archival documents and oral histories for more dynamic and grounded responses. |
| 3 | Historical Event Simulation | Offer interactive walkthroughs and scenario explorers for students and educators. |
| 4 | Analytics and Feedback Loop | Use user feedback to refine retrieval, chunking logic, and prompt strategies for accuracy and relevance. |
| 5 | Deployment and Public Access | Launch a secure, hosted version suitable for schools, museums, and civic platforms. |
| 6 | Educational Mode and Timeline Explorer | Provide interactive timelines and learning modules to support students and lifelong learners. |
Prerequisites
Python 3.10+
Virtual environment recommended
Steps
git clone https://github.com/YonatanAwoke/Ethiopian_History_RAG
cd Ethiopian_History_RAG
pip install -r requirements.txt
Run the CLI (from the repository root)
cd code
python vector_db_rag_retrieval.py
Run the Streamlit App (from the repository root)
cd code
streamlit run ethiopian_history_streamlit_chat.py
Source Transparency: All responses cite their origin.
Hallucination Mitigation: Answers are limited to retrieved context.
Ethical Usage: Designed for educational purposes; not a substitute for academic research.
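One way the first two safeguards could be enforced is in the prompt itself. The sketch below is an assumption about how such a guard might look, not the project's actual prompt: `build_grounded_prompt`, `answer`, `FALLBACK`, and the passage fields `source` and `text` are hypothetical names. It numbers each retrieved passage so the model can cite it, instructs the model to stay within that context, and skips the model call entirely when nothing was retrieved.

```python
from typing import Callable, Optional

# Hypothetical reply used when retrieval returns nothing.
FALLBACK = "I could not find this in the indexed sources."


def build_grounded_prompt(question: str, passages: list[dict]) -> Optional[str]:
    """Return a context-only prompt with numbered sources, or None if no passages."""
    if not passages:
        return None
    numbered = "\n".join(
        f"[{i}] ({p['source']}) {p['text']}" for i, p in enumerate(passages, start=1)
    )
    return (
        "Answer using ONLY the numbered context below and cite sources like [1].\n"
        "If the context does not contain the answer, say you do not know.\n\n"
        f"Context:\n{numbered}\n\nQuestion: {question}"
    )


def answer(question: str, passages: list[dict], llm: Callable[[str], str]) -> str:
    """Call the model only when there is retrieved context to ground the answer."""
    prompt = build_grounded_prompt(question, passages)
    if prompt is None:
        return FALLBACK
    return llm(prompt)


# Placeholder passage for illustration only.
passages = [{"source": "Battle of Adwa", "text": "The battle took place in 1896."}]
prompt = build_grounded_prompt("When was the Battle of Adwa?", passages)
print(prompt)
```

Prompt instructions reduce, but do not eliminate, unsupported answers, so the empty-retrieval check is a useful second line of defense.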
Explore the full source code, CLI tools, UI, and model configuration on GitHub: https://github.com/YonatanAwoke/Ethiopian_History_RAG