Source Selection: Determining which knowledge source is most appropriate for a given query
Real-time Information: Accessing up-to-date information beyond training data cutoffs
Multi-domain Queries: Handling questions that span current events, academic research, and encyclopedic knowledge
Response Quality: Ensuring factual accuracy and minimizing hallucinations
1.2 Proposed Solution
We present an agentic AI system that:
Automatically selects appropriate tools based on query context
Orchestrates multiple searches when necessary
Synthesizes information from diverse sources
Provides cited responses to ensure transparency and verifiability
Maintains conversation history using typed dictionaries
Implements message accumulation for context preservation
Supports conditional routing based on tool requirements
2.1.2 Language Model Configuration
Primary Model: OpenAI GPT-OSS-20B via Groq
Temperature: 1.0 (balanced creativity and accuracy)
Reasoning Effort: Medium
Purpose: Tool selection and response synthesis
2.1.3 Multi-Tool Ecosystem

2.2 Intelligent Tool Selection
The agent employs a sophisticated prompt-based routing system:
Decision Criteria:
Temporal relevance: Recent events → Tavily
Conceptual queries: Definitions, explanations → Wikipedia
Academic queries: Research, technical information → ArXiv
Hybrid queries: Multiple tools in parallel
2.3 Safety and Ethics Implementation
Built-in Safeguards:
Rejection of unethical/illegal information requests
Source citation requirements
Instruction injection prevention
Language-adaptive responses (English/French)
Uses embeddings to compute semantic similarity
Measures focus and pertinence of response
Penalizes verbose or off-topic content
3.1.3 Context Recall
Definition: Assesses whether all relevant information from the ground truth is captured in the retrieved context.
Formula:
Context Recall = (Relevant context retrieved) / (Total relevant context available)
Importance: Ensures comprehensive information retrieval
3.2 Evaluation Infrastructure
Configuration:
Evaluator LLM: Llama-3.1-8B-Instant (Groq)
Embeddings: sentence-transformers/all-MiniLM-L6-v2
Test Dataset: 5 diverse queries spanning multiple domains
Metrics: Faithfulness, Answer Relevancy, Context Recall
Test Categories:
Scientific concepts (Quantum physics, Machine learning)
Current events (AI in Africa)
Historical facts (Penicillin discovery)
Recent developments (Nuclear fusion)
Tool failures don't crash the system
Error messages logged for debugging
Alternative sources attempted when available
User-friendly error communication
4.3 Performance Optimizations
Key Improvements:
Increased retrieval depth: max_results=7 (Tavily), top_k=5 (others)
Advanced search mode: Enhanced relevance filtering
Parallel tool execution: Reduced latency for multi-source queries
Context-based synthesis: "Base your answer only on retrieved context"
Conclusion
This work demonstrates the practical implementation of an intelligent multi-tool research assistant that addresses key challenges in modern information retrieval. By combining LangGraph's orchestration capabilities, Groq's high-performance inference, and a curated set of specialized tools, we achieve:
Intelligent Automation: Context-aware tool selection without manual intervention
Quality Assurance: Measurable performance through RAGAS evaluation
Production Readiness: Robust error handling and scalable architecture
The system represents a significant step toward autonomous, reliable, and user-friendly AI assistants capable of handling diverse information needs across multiple domains.