Mama Amaka: The RAG Application for Native Nigerian Dishes
Abstract
Mama Amaka is an AI-powered Retrieval-Augmented Generation (RAG) assistant that provides contextual, accurate answers about Nigerian cuisine.
Built with LangChain, ChromaDB, and multiple LLM providers (OpenAI, Groq, Google Gemini), the application combines vector-based semantic search with large language models to deliver personalized cooking guidance.
The system indexes traditional Nigerian recipes, chunks them for efficient retrieval, and generates warm, culturally-informed responses through a friendly "Mama Amaka" persona.
This tool addresses the gap in accessible, AI-driven resources for learning traditional Nigerian cooking methods and serves as both a practical culinary assistant and an educational reference for RAG architecture implementation.
Nigerian cuisine represents one of Africa's richest culinary traditions, featuring diverse dishes like Jollof rice, Egusi soup, and Suya that have gained international recognition. However, accessing authentic, detailed information about traditional cooking methods remains challenging for many enthusiasts. Existing recipe resources often lack the conversational, contextual guidance that home cooks need when preparing unfamiliar dishes.
Retrieval-Augmented Generation (RAG) has emerged as a powerful paradigm for building knowledge-intensive AI applications (Lewis et al., 2020). By combining the precision of information retrieval with the generative capabilities of large language models, RAG systems can provide accurate, contextual responses grounded in specific knowledge bases, making them ideal for domain-specific applications like culinary assistance.
Mama Amaka bridges this gap by creating an intelligent recipe assistant that understands and responds to natural language queries about Nigerian food. Rather than simply returning search results, the system synthesizes information from its recipe knowledge base into comprehensive, conversational answers, delivered with a warm, motherly personality that reflects the cultural tradition of learning to cook from family elders.
This publication presents the technical architecture, implementation methodology, and practical considerations for building and deploying Mama Amaka, serving both as documentation for users and as an educational resource for developers interested in RAG system development.
Methodology
RAG Architecture Overview
Mama Amaka implements a standard RAG pipeline that processes user queries through four distinct stages: document ingestion, indexing, retrieval, and response generation.
Document Ingestion
The system ingests recipe documents through a multi-step processing pipeline:
1. Document Loading: Text files containing Nigerian recipes are loaded from the data/ directory. Each recipe includes the dish name, ingredients list, cooking instructions, and optional serving suggestions.
2. Text Chunking: Documents are split using LangChain's RecursiveCharacterTextSplitter with the following parameters:
Chunk size: 500 characters
Chunk overlap: 50 characters
Splitting hierarchy: paragraphs → sentences → words
This hierarchical approach preserves semantic coherence while creating manageable chunks for embedding and retrieval.
3. Embedding Generation: Each chunk is converted to a 384-dimensional vector representation using the sentence-transformers/all-MiniLM-L6-v2 model. This lightweight model provides strong semantic similarity performance while maintaining reasonable computational requirements.
4. Vector Storage: Embeddings are stored in ChromaDB, an open-source vector database that supports persistent storage and efficient similarity search using cosine distance.
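The chunking scheme in step 2 can be sketched in plain Python. This is a simplified stand-in for LangChain's RecursiveCharacterTextSplitter, which additionally prefers paragraph and sentence boundaries when choosing split points; the sliding-window-with-overlap idea is the same:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping fixed-size chunks (simplified sketch)."""
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        chunks.append(text[start:end])
        if end == len(text):
            break
        # Step back by `overlap` so context is preserved across chunk boundaries
        start = end - overlap
    return chunks

recipe = "Jollof rice step one. " * 60  # ~1300 characters of sample text
chunks = chunk_text(recipe)
print(len(chunks), len(chunks[0]))  # → 3 500
```

The 50-character overlap means a sentence cut at a chunk boundary still appears intact in the neighboring chunk, which keeps retrieval from losing instructions that straddle a split.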
The VectorDB class handles all vector storage and retrieval operations:

```python
from src.vectordb import VectorDB

# Initialize vector database
vdb = VectorDB(collection_name="mama_amaka_recipes")

# Search for relevant recipe content
results = vdb.search("jollof rice", n_results=3)
print(f"Found {len(results['documents'])} relevant chunks")
```
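Under the hood, ChromaDB ranks chunks by cosine similarity between the query embedding and the stored embeddings. The core ranking step can be illustrated with NumPy (illustrative only, not VectorDB's actual code):

```python
import numpy as np

def cosine_top_k(query_vec, stored_vecs, k=3):
    """Rank stored vectors by cosine similarity to the query; return top-k indices."""
    stored = np.asarray(stored_vecs, dtype=float)
    q = np.asarray(query_vec, dtype=float)
    # Cosine similarity = dot product divided by the product of vector norms
    sims = (stored @ q) / (np.linalg.norm(stored, axis=1) * np.linalg.norm(q))
    return np.argsort(-sims)[:k].tolist()

# Toy 2-D "embeddings": index 0 points the same way as the query
print(cosine_top_k([1.0, 0.0], [[2.0, 0.0], [0.0, 1.0], [1.0, 1.0]], k=2))  # → [0, 2]
```

In the real system the vectors are the 384-dimensional MiniLM embeddings, but the ranking principle is identical.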
Retrieval Mechanism
When a user submits a query, the system:
1. Embeds the query using the same sentence transformer model
2. Performs cosine similarity search against the vector database
3. Retrieves the top-K most relevant chunks (default K=3)
4. Formats retrieved chunks with source attribution for context injection
```python
def ask(self, query: str, n_results: int = 3) -> str:
    """Process user query and generate contextual response."""
    # Retrieve relevant context from vector database
    search_results = self.vector_db.search(query, n_results=n_results)
    # Combine retrieved chunks into context
    context = self._format_context(search_results)
    # Generate response using LLM with retrieved context
    response = self.chain.invoke({
        "context": context,
        "question": query,
    })
    return response
```
Response Generation
The assembled context is combined with the user query in a structured prompt template that defines the "Mama Amaka" persona: a warm, knowledgeable Nigerian cooking expert.
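An illustrative version of such a persona prompt is shown below. The wording is assumed for illustration and is not the project's actual template:

```python
from string import Template

# Illustrative "Mama Amaka" persona prompt; the exact wording used by the
# project may differ.
MAMA_AMAKA_PROMPT = Template(
    "You are Mama Amaka, a warm, knowledgeable Nigerian cooking expert.\n"
    "Answer the question using ONLY the recipe context below, speaking\n"
    "kindly, as if teaching a family member. If the context does not\n"
    "cover the question, say so honestly.\n\n"
    "Context:\n$context\n\n"
    "Question: $question\n"
    "Answer:"
)

prompt = MAMA_AMAKA_PROMPT.substitute(
    context="Jollof rice: parboil rice, fry a tomato-pepper base...",
    question="What is jollof rice?",
)
print(prompt)
```

Grounding the answer in the retrieved context, and instructing the model to admit gaps, is what lets the system decline out-of-scope questions gracefully.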
The system supports multiple LLM providers (OpenAI GPT-4o-mini, Groq Llama-3.1, Google Gemini) through LangChain's unified interface, allowing users to choose based on performance requirements and cost considerations.
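Provider selection can be sketched as a simple dispatch over environment variables. The variable names mirror the configuration shown in the Experiments section; the dispatch logic itself is illustrative, not the project's actual code:

```python
import os

# Hypothetical provider dispatch: pick the first provider whose API key is set.
# (Key and model-variable names follow the .env configuration; the function
# itself is an illustration, not the project's source.)
PROVIDERS = [
    ("OPENAI_API_KEY", "openai", "gpt-4o-mini"),
    ("GROQ_API_KEY", "groq", "llama-3.1-8b-instant"),
    ("GOOGLE_API_KEY", "google", "gemini-2.0-flash"),
]

def pick_provider(env=None):
    """Return (provider_name, model_name) for the first configured provider."""
    env = os.environ if env is None else env
    for key, name, default_model in PROVIDERS:
        if env.get(key):
            return name, env.get(f"{name.upper()}_MODEL", default_model)
    raise RuntimeError("No LLM provider API key configured")
```

With this pattern, switching providers is a matter of editing .env rather than changing code, which is what makes the cost/speed/quality trade-off easy to explore.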
Experiments
System Configuration
The development and testing environment consisted of:
Hardware: CPU-based inference (GPU optional for embedding generation)
Memory: 4GB minimum, 8GB recommended
```bash
# .env configuration file
# Choose ONE of the following API keys:

# Option 1: OpenAI (Recommended for best quality)
OPENAI_API_KEY=your_openai_api_key_here
OPENAI_MODEL=gpt-4o-mini

# Option 2: Groq (Fast and free tier available)
GROQ_API_KEY=your_groq_api_key_here
GROQ_MODEL=llama-3.1-8b-instant

# Option 3: Google Gemini
GOOGLE_API_KEY=your_google_api_key_here
GOOGLE_MODEL=gemini-2.0-flash

# Optional: Custom ChromaDB collection name
CHROMA_COLLECTION_NAME=mama_amaka_recipes

# Optional: Custom embedding model
EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
```
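Projects like this typically load the file with python-dotenv, but the format is simple enough to sketch a dependency-free parser (illustrative, not the project's loader):

```python
def load_env(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    config = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            config[key.strip()] = value.strip()
    return config

# Demo: write a minimal .env-style file and read it back
with open("demo.env", "w") as f:
    f.write("# comment\nGROQ_API_KEY=abc123\nGROQ_MODEL=llama-3.1-8b-instant\n")
print(load_env("demo.env"))  # → {'GROQ_API_KEY': 'abc123', 'GROQ_MODEL': 'llama-3.1-8b-instant'}
```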
Knowledge Base
The initial knowledge base includes 6 traditional Nigerian recipes:
Jollof Rice
Egusi Soup
Coconut Rice
Moi-Moi
Yam and Egg Sauce
Pepper Soup
Each recipe document contains comprehensive information including regional variations, ingredient substitutions, and cooking tips.
For example, the Jollof Rice document begins:

```text
Jollof rice is a popular party favourite in Nigeria...

INGREDIENTS
Serves 4
- 500g Long grain rice
- 3 cooking spoons Margarine/Vegetable oil
- 400g Tomato paste
- 2 Onions (chopped)
- 3 Scotch bonnet peppers
...

METHOD
STEP 1 Melt the butter in a large pot...
STEP 2 Add the rice and stir to coat...
...
```
```python
# Full RAG pipeline
from src.agent import MamaAmakaAgent  # import path assumed

# Initialize and prepare agent
agent = MamaAmakaAgent()
agent.ingest_data()

# Ask a question
answer = agent.ask("What is jollof rice?")
assert len(answer) > 0
print(answer)
```
Test Queries
The system was validated against a diverse set of query types:
| Query Type | Example | Expected Behavior |
| --- | --- | --- |
| Direct recipe request | "How do I make jollof rice?" | Return complete cooking instructions |
| Ingredient inquiry | "What ingredients are in egusi soup?" | List all required ingredients |
| Technique question | "How long does moi-moi take to steam?" | Provide specific timing information |
| General knowledge | "Tell me about coconut rice" | Offer overview with cultural context |
| Out-of-scope | "How do I make sushi?" | Acknowledge limitation gracefully |
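These categories can be turned into a lightweight smoke test. The harness below is illustrative and assumes the MamaAmakaAgent interface shown earlier:

```python
# Queries drawn from the validation categories; the last one is deliberately
# out of scope and should produce a graceful refusal rather than an error.
TEST_QUERIES = [
    "How do I make jollof rice?",
    "What ingredients are in egusi soup?",
    "How long does moi-moi take to steam?",
    "Tell me about coconut rice",
    "How do I make sushi?",
]

def run_smoke_tests(agent) -> int:
    """Assert every query yields a non-empty string answer; return count run."""
    for query in TEST_QUERIES:
        answer = agent.ask(query)
        assert isinstance(answer, str) and answer.strip(), f"empty answer: {query}"
    return len(TEST_QUERIES)
```

A harness like this only checks that answers come back non-empty; judging answer quality (correct ingredients, graceful refusals) still requires the qualitative review described below.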
Results
Retrieval Performance
The vector search component demonstrated strong performance characteristics.
Qualitative evaluation of system responses revealed:
Strengths:
Accurate ingredient lists and proportions
Clear, sequential cooking instructions
Appropriate handling of out-of-scope queries
Consistent persona maintenance across interactions
Areas for Improvement:
Limited coverage (6 recipes in initial knowledge base)
No support for follow-up questions or conversation memory
Text-only responses (no images or videos)
ChromaDB's in-memory search scales efficiently up to approximately 1 million vectors, making it suitable for recipe collections of significant size.
Conclusion
Mama Amaka demonstrates the practical application of RAG architecture for creating domain-specific AI assistants. By combining semantic search with large language models, the system provides accurate, contextual responses about Nigerian cuisine while maintaining a culturally appropriate conversational style.
Key Contributions
Practical RAG Implementation: A complete, working example of RAG architecture using modern tools (LangChain, ChromaDB, Sentence Transformers)
Multi-Provider Flexibility: Support for OpenAI, Groq, and Google Gemini allows optimization for cost, speed, or quality
Cultural Preservation: Digital documentation and accessibility of traditional Nigerian cooking knowledge
Educational Resource: Well-documented codebase serves as a learning reference for RAG system development
Limitations
Knowledge Base Scope: Currently limited to 6 recipes; expansion requires manual document creation
Single-Turn Interactions: No conversation memory or context carryover between queries
Text-Only Interface: CLI-based interaction without visual elements
Language Support: English only; no support for Nigerian languages (Yoruba, Igbo, Hausa)
Future Directions
Potential enhancements include:
Expanded Recipe Database: Integration with recipe APIs or web scraping for broader coverage
Multimodal Support: Adding recipe images and video tutorials
Web Interface: Streamlit or Gradio-based UI for improved accessibility
Conversation Memory: Implementing chat history for multi-turn interactions
Voice Interface: Speech-to-text input for hands-free cooking assistance
Multilingual Support: Nigerian language translations for broader accessibility
Availability
Mama Amaka is open-source under the MIT License. The complete codebase, documentation, and sample recipes are available at:
References
Lewis, P., et al. (2020). "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks." Advances in Neural Information Processing Systems, 33, 9459-9474.