MusafirAI: RAG-Powered Travel Itinerary Generator
Project Overview
MusafirAI transforms travel planning through Retrieval-Augmented Generation (RAG) technology, creating personalized Pakistan itineraries by combining AI with curated travel knowledge. Built with LangChain and Cohere/Gemini AI, this system converts travel guides into intelligent recommendations with source attribution, delivering culturally rich travel plans.
Core Functionality
MusafirAI solves the problem of generic travel planning by implementing:
- Destination Intelligence: Semantic search across curated Pakistan travel documents
- Personalized Itineraries: Day-by-day plans based on budget, interests, and group size
- Contextual Recommendations: Culturally appropriate suggestions with practical info
- Knowledge Expansion: Upload custom travel guides to enhance recommendations
- Source Attribution: Show sources used for each itinerary component
Technical Architecture
System Architecture Diagram

System Components
1. Document Processing Engine (document_loader.py):
- DirectoryLoader with text chunking (600 tokens, 100 overlap)
- Metadata enrichment for locations (city, province, type)
2. Vector Intelligence Core (vector_store_manager.py):
- ChromaDB/FAISS vector stores with SentenceTransformer embeddings
- Automatic persistence and recreation of knowledge base
3. RAG Orchestrator (rag_system.py):
- RetrievalQA chain with MMR search and score filtering
- Custom prompt engineering for structured itineraries
- Dual LLM support (Cohere Command/Gemini Pro)
4. Web Interface (MusafirAI.py):
- Streamlit UI with PDF export and source inspection
- Dynamic form handling for travel preferences
- Vector store management controls
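The chunking and metadata enrichment steps of the Document Processing Engine can be sketched in plain Python. This is a minimal illustration, not the code in document_loader.py: the helper names `chunk_tokens` and `enrich` are hypothetical, and it assumes whitespace-separated words as the "tokens" (the project's actual splitter may count characters or model tokens instead).

```python
def chunk_tokens(tokens, chunk_size=600, overlap=100):
    """Split a token list into windows of chunk_size that share `overlap` tokens."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reached the end of the document
    return chunks


def enrich(chunks, city, province, place_type):
    """Attach location metadata (city, province, type) to every chunk."""
    return [
        {
            "text": " ".join(chunk),
            "metadata": {"city": city, "province": province, "type": place_type},
        }
        for chunk in chunks
    ]


# Usage with a synthetic 1,300-word document (assumed whitespace tokens).
words = [f"w{i}" for i in range(1300)]
records = enrich(chunk_tokens(words), "Lahore", "Punjab", "historical")
```

The overlap means the last 100 tokens of one chunk reappear at the start of the next, so a sentence that straddles a boundary is still retrievable from at least one chunk.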
Technology Stack
- AI Framework: LangChain 0.0.340
- LLMs: Cohere Command, Google Gemini Pro
- Embeddings: all-MiniLM-L6-v2 Sentence Transformers
- Vector DB: ChromaDB 0.4.15 / FAISS 1.7.4
- Web Framework: Streamlit 1.28.1
- Dependencies: Python 3.11, ReportLab, BeautifulSoup
Key Features & Innovations
Advanced RAG Implementation
- Destination-aware chunking preserving location context
- Hybrid metadata filtering (city/province/type) + semantic search
- Prompt engineering enforcing day-by-day structure with practical details
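As a rough illustration of hybrid retrieval, the sketch below first narrows candidates by exact metadata match and then ranks the survivors by cosine similarity. It is an assumption-laden stand-in for what ChromaDB or FAISS would do internally: the function names, the two-dimensional toy embeddings, and the document ids are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(query_vec, docs, filters, k=3):
    """Keep docs whose metadata matches every filter, then rank by similarity."""
    candidates = [
        d for d in docs
        if all(d["metadata"].get(key) == value for key, value in filters.items())
    ]
    ranked = sorted(
        candidates,
        key=lambda d: cosine(query_vec, d["embedding"]),
        reverse=True,
    )
    return ranked[:k]


# Toy corpus: real embeddings from all-MiniLM-L6-v2 have 384 dimensions.
docs = [
    {"id": "lahore_fort", "embedding": [1.0, 0.0],
     "metadata": {"city": "Lahore", "province": "Punjab", "type": "historical"}},
    {"id": "hunza_valley", "embedding": [0.9, 0.1],
     "metadata": {"city": "Hunza", "province": "Gilgit-Baltistan", "type": "nature"}},
    {"id": "badshahi_mosque", "embedding": [0.6, 0.8],
     "metadata": {"city": "Lahore", "province": "Punjab", "type": "historical"}},
]
top = hybrid_search([1.0, 0.0], docs, {"city": "Lahore"}, k=2)
```

Filtering before ranking keeps a semantically close but off-destination chunk (here, the Hunza entry) out of a Lahore itinerary.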
User Experience Excellence
- Interactive preference form with Pakistan-specific options
- Source document inspection with metadata
- One-click PDF/text itinerary downloads
- Real-time vector store management
Production-Ready Features
- API key management with .env and Streamlit Secrets
- Automatic directory creation for cloud environments
- Comprehensive error handling with user feedback
- Dual deployment support (local/cloud)
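One way the key lookup and directory setup might work across local and cloud deployments is sketched below. The helper names are hypothetical; the sketch assumes that locally the .env file has already been loaded into the environment (for example with python-dotenv), and that on Streamlit Community Cloud the `secrets` argument would be `st.secrets`. The `vector_store` directory name is also an assumption.

```python
import os
from pathlib import Path


def resolve_api_key(name, secrets=None):
    """Look up a key in the environment first, then in a secrets mapping."""
    value = os.environ.get(name)
    if value:
        return value
    if secrets is not None and name in secrets:
        return secrets[name]
    return None  # caller decides how to report the missing key to the user


def ensure_dirs(*paths):
    """Create each directory if it is missing; safe to call on every startup."""
    created = []
    for p in paths:
        path = Path(p)
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling something like `ensure_dirs("travel_documents", "vector_store")` at startup matters on hosts with ephemeral storage, where directories from a previous run may no longer exist.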
System Workflow

System Capabilities
- Processes 50+ page documents in under 60 seconds
- Generates 7-day itineraries with 5+ attractions/day
- Maintains context across multiple queries
- Accurately attributes sources with content previews
Implementation Details
Installation & Setup
git clone https://github.com/zshafique25/RAG_Assisstant.git
cd RAG_Assisstant
# Install dependencies
pip install -r requirements.txt
# Add API keys to .env
echo "COHERE_API_KEY=your_key" >> .env
echo "GEMINI_API_KEY=your_key" >> .env
# Launch app
streamlit run MusafirAI.py
Configuration Requirements
- Cohere or Gemini API key
- Travel documents in travel_documents/ directory
- Python 3.11 (via runtime.txt)
Prompt Engineering
PROMPT_TEMPLATE = """Create Pakistan travel itinerary using context:
Context Information:
{context}
User Query: {question}
Instructions:
1. Create day-by-day itinerary
2. Include key attractions from context
3. Add practical info: travel times, costs
4. Suggest accommodation and dining
5. Include cultural etiquette and safety tips
6. Keep concise and realistic"""
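
To show how the two placeholders in the template get filled, here is a minimal sketch using plain `str.format`. The `build_prompt` helper and the numbered-chunk layout are assumptions for illustration, and the template is abbreviated; in the project the RetrievalQA chain presumably performs this substitution itself.

```python
# Abbreviated copy of the template above, kept short for the example.
PROMPT_TEMPLATE = """Create Pakistan travel itinerary using context:
Context Information:
{context}
User Query: {question}
Instructions:
1. Create day-by-day itinerary"""


def build_prompt(chunks, question, template=PROMPT_TEMPLATE):
    """Join retrieved chunks into a numbered context block and fill the template."""
    context = "\n\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1))
    return template.format(context=context, question=question)


prompt = build_prompt(
    ["Lahore Fort is a citadel in the walled city.", "Badshahi Mosque sits opposite the fort."],
    "Plan 2 days in Lahore for a family of four",
)
```

Numbering the chunks gives the model a simple handle for each source, which can make attribution in the generated itinerary easier to trace.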
Use Cases & Applications
Tourism Industry
- Travel agencies generating personalized packages
- Hotel chains creating neighborhood guides
- Tourism boards promoting regional destinations
Traveler Applications
- Solo travelers discovering hidden gems
- Families planning multi-generational trips
- Adventure seekers finding offbeat experiences
Educational Use
- Cultural studies students exploring regions
- Geography classes analyzing travel patterns
- Language learners exploring local dialects
Technical Innovations
RAG Pipeline Optimization
- Dynamic chunk sizing based on content type
- Hybrid retrieval (similarity + metadata filtering)
- LLM fallback mechanism (Cohere → Gemini)
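The fallback idea can be sketched as trying providers in order and returning the first successful answer. This is an illustrative pattern, not the project's rag_system.py: `generate_with_fallback` is a hypothetical name, and the two provider functions are stubs standing in for real Cohere and Gemini client calls.

```python
def generate_with_fallback(prompt, providers):
    """Try each (name, callable) pair in order; return the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # any provider failure triggers the next one
            errors.append(f"{name}: {exc}")
    raise RuntimeError("All LLM providers failed: " + "; ".join(errors))


# Stub providers (assumptions): the first simulates an outage, the second answers.
def cohere_stub(prompt):
    raise TimeoutError("simulated rate limit")


def gemini_stub(prompt):
    return "Day 1: Lahore Fort and Badshahi Mosque"


used, answer = generate_with_fallback(
    "Plan 1 day in Lahore",
    [("cohere", cohere_stub), ("gemini", gemini_stub)],
)
```

Returning the provider name alongside the text lets the UI tell the user which model produced the itinerary, and collecting the errors makes a total failure easier to diagnose.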
Knowledge Management
- Automatic metadata extraction from filenames
- User-uploaded document processing pipeline
- Vector store versioning through force_recreate
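Metadata extraction from filenames could look like the sketch below. The naming convention `<city>_<province>_<type>.txt` is purely an assumption made for this example (the project's real convention is not documented here), and `metadata_from_filename` is a hypothetical helper.

```python
from pathlib import Path


def metadata_from_filename(path):
    """Derive city/province/type from a '<city>_<province>_<type>' file stem."""
    file = Path(path)
    parts = file.stem.split("_")
    if len(parts) != 3:
        # Unknown naming pattern: record only the source file.
        return {"source": file.name}
    city, province, place_type = parts
    return {
        "source": file.name,
        "city": city.title(),
        "province": province.title(),
        "type": place_type.lower(),
    }


meta = metadata_from_filename("travel_documents/hunza_gilgit-baltistan_nature.txt")
```

Falling back to a source-only record keeps user-uploaded guides with arbitrary names usable, at the cost of those chunks being invisible to city or province filters.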
Cloud Optimization
- Ephemeral storage handling for serverless environments
- Dependency resolution for Python 3.11
- Secret management with Streamlit TOML
Future Enhancements
- Interactive Pakistan map for destination selection
- Chat interface for itinerary modifications
- Budget calculator with real-time pricing
Long-Term Vision
- Multi-format support (PDF/Word/YouTube transcripts)
- Hotel/activity booking API integration
- Local language support (Urdu/Pashto)
- AR experience for destination previews
Project Impact & Value
Technical Contributions
- Complete open-source RAG travel assistant
- Production-ready cloud deployment configuration
- Modular architecture for easy extension
Practical Benefits
- Democratizes travel expertise through AI
- Preserves cultural knowledge in retrievable format
- Increases tourism accessibility for remote regions
Innovation Recognition
- Dual LLM support with failover mechanism
- Context-aware metadata enrichment
- Cloud-native vector store management
Learning Outcomes & Skills Demonstrated
AI Engineering
- End-to-end RAG system implementation
- Vector database management/optimization
- LLM integration and prompt engineering
Software Development
- Modular architecture with separation of concerns
- Streamlit UI development with custom CSS
- PDF generation with ReportLab
DevOps & Production
- Cloud deployment on Streamlit Community Cloud
- Dependency management for constrained environments
- Secret management and security practices
Conclusion
MusafirAI applies RAG to travel planning for Pakistan. By combining semantic search with generative AI, it produces personalized, culturally rich itineraries grounded in curated travel documents, with the sources shown for each itinerary component. The project pairs a modular architecture with error handling, cloud deployment, and a Streamlit interface.
The modular design allows extension to new regions or languages, and the open-source code gives developers a working reference for RAG implementations. MusafirAI shows how AI can support a traditional industry while preserving cultural authenticity.
Documentation: Comprehensive setup/usage instructions in README.md

