Business executives need deep market intelligence fast. Traditional consulting firms charge thousands of dollars for reports that take weeks to deliver. AI chatbots provide generic answers lacking depth and structure.
Our goal: build a system that delivers consulting-grade analysis in under 2 minutes, for free.
SRIP deploys four specialized AI agents working in parallel:
- **Market agent** — analyzes market size, trends, and opportunities. Output: 500+ word detailed analysis.
- **Competitive agent** — profiles competitors and analyzes competitive dynamics. Output: 300+ word competitive assessment.
- **Risk agent** — identifies and quantifies risks across multiple dimensions. Output: 300+ word risk evaluation.
- **Strategic advisor** — synthesizes the specialists' insights and generates recommendations. Output: strategic recommendations plus a synthesis.
- **Frontend:** Gradio (Python web UI)
- **Orchestration:** LangGraph (agent workflow)
- **LLM engine:** Groq API (LLaMA 3.3 70B)
- **Validation:** Pydantic models + Guardrails
- **Testing:** Pytest (74% coverage)
Groq over OpenAI:
LangGraph over LangChain:
Gradio over Streamlit:
Problem: Sequential execution is too slow (4 agents × ~30s each ≈ 120s)
Solution: Run all agents in parallel using asyncio
```python
async def execute_analysis(query: str):
    # All agents start simultaneously
    market_task = asyncio.create_task(market_agent.execute(query))
    competitive_task = asyncio.create_task(competitive_agent.execute(query))
    risk_task = asyncio.create_task(risk_agent.execute(query))

    # Wait for all to complete
    market, competitive, risk = await asyncio.gather(
        market_task, competitive_task, risk_task
    )

    # Strategic advisor synthesizes the three specialist reports
    strategy = await strategic_agent.execute(
        query, market, competitive, risk
    )

    return compile_report(market, competitive, risk, strategy)
```
Result: Analysis time reduced to 90-95 seconds
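The speedup can be seen in a minimal, self-contained sketch. The `fake_agent` coroutine below is a hypothetical stand-in for a real agent call (it sleeps instead of calling the LLM); `asyncio.gather` overlaps the three specialist calls, so the total runtime tracks the slowest agent rather than the sum:

```python
import asyncio
import time

async def fake_agent(name: str, delay: float) -> str:
    # Stand-in for a real agent; sleeps instead of calling the LLM
    await asyncio.sleep(delay)
    return f"{name} report"

async def run_parallel() -> tuple[list[str], float]:
    start = time.perf_counter()
    # Three specialists run concurrently: total time ~= max(delays)
    reports = await asyncio.gather(
        fake_agent("market", 0.2),
        fake_agent("competitive", 0.2),
        fake_agent("risk", 0.2),
    )
    return list(reports), time.perf_counter() - start

reports, elapsed = asyncio.run(run_parallel())
```

With three 0.2s agents, `elapsed` lands near 0.2s instead of the 0.6s a sequential loop would take, which is the same effect that cuts the real pipeline from ~120s to ~90s.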
Problem: Repeated queries waste API calls
Solution: Cache responses using SHA256 hashing
```python
def _cache_key(self, content: str) -> str:
    return hashlib.sha256(content.encode()).hexdigest()[:20]

def _execute_with_retry(self, messages: list):
    cache_key = self._cache_key(str(messages))
    if cached := self._get_cached(cache_key):
        return cached
    result = self.client.chat.completions.create(...)
    self._set_cache(cache_key, result)
    return result
```
Result: Repeated queries complete in under 5 seconds
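A runnable sketch of the same idea, with a hypothetical `ResponseCache` class standing in for the real client (the "API call" is a placeholder string, and `calls` counts how often it would have been hit):

```python
import hashlib

class ResponseCache:
    """Tiny in-memory stand-in for the SHA256-keyed response cache."""

    def __init__(self):
        self._store: dict[str, str] = {}
        self.calls = 0  # how many times the "API" was actually hit

    def _cache_key(self, content: str) -> str:
        # Same scheme as above: first 20 hex chars of the SHA256 digest
        return hashlib.sha256(content.encode()).hexdigest()[:20]

    def execute(self, prompt: str) -> str:
        key = self._cache_key(prompt)
        if key in self._store:
            return self._store[key]
        self.calls += 1
        result = f"analysis of: {prompt}"  # placeholder for the LLM call
        self._store[key] = result
        return result

cache = ResponseCache()
first = cache.execute("cybersecurity market")
second = cache.execute("cybersecurity market")  # served from cache
```

The second call returns the cached result without touching the API, which is why repeated queries finish in seconds.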
Problem: Groq free tier has rate limits
Solution: Smart retry with exponential backoff
```python
for attempt in range(5):
    try:
        return self.client.chat.completions.create(...)
    except RateLimitError:
        wait_time = min(2 ** attempt, 16)
        logger.warning(f"Rate limit, waiting {wait_time}s")
        time.sleep(wait_time)
        continue
```
Result: 95% success rate even under heavy load
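The backoff schedule can be exercised without a real API. In this sketch, `RateLimitError` and `flaky_call` are hypothetical stand-ins for the client's exception and a rate-limited endpoint, and `sleep` is injectable so the example runs instantly:

```python
import itertools

class RateLimitError(Exception):
    """Stand-in for the Groq client's rate-limit exception."""

def call_with_backoff(api_call, max_attempts: int = 5, sleep=lambda s: None):
    # Same schedule as above: 1s, 2s, 4s, 8s, capped at 16s
    waits = []
    for attempt in range(max_attempts):
        try:
            return api_call(), waits
        except RateLimitError:
            wait_time = min(2 ** attempt, 16)
            waits.append(wait_time)
            sleep(wait_time)
    raise RateLimitError("gave up after retries")

# Simulated endpoint that rate-limits twice, then succeeds
responses = itertools.chain([RateLimitError, RateLimitError], ["ok"])
def flaky_call():
    item = next(responses)
    if item is RateLimitError:
        raise RateLimitError()
    return item

result, waits = call_with_backoff(flaky_call)
```

Two rate-limit hits cost 1s + 2s of waiting before the third attempt succeeds, which is how the system absorbs bursts without failing the request.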
Problem: AI outputs can be incomplete
Solution: Multi-layer quality checks
```python
def _calculate_quality_score(self, result) -> float:
    scores = []

    # Completion check (40%)
    completion_rate = sum(result.completion_status.values()) / 4
    scores.append(completion_rate * 0.4)

    # Content length check (30%)
    length_score = sum(
        1 for key, min_len in min_lengths.items()
        if len(getattr(result, key)) >= min_len
    ) / len(min_lengths)
    scores.append(length_score * 0.3)

    # Recommendations check (30%)
    rec_score = 1.0 if 6 <= len(result.strategic_actions) <= 8 else 0.5
    scores.append(rec_score * 0.3)

    return sum(scores)
```
Result: Average quality score of 88.3%
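The weighting can be checked with a standalone version of the scorer. This sketch takes plain dicts instead of the result object (the section names and thresholds here are illustrative, not the project's exact configuration):

```python
def quality_score(completion_status: dict, lengths: dict,
                  min_lengths: dict, n_recommendations: int) -> float:
    # Completion check (40%): fraction of agents that finished
    completion_rate = sum(completion_status.values()) / len(completion_status)
    # Content length check (30%): fraction of sections meeting their minimum
    length_rate = sum(
        1 for key, min_len in min_lengths.items() if lengths[key] >= min_len
    ) / len(min_lengths)
    # Recommendations check (30%): full credit only for 6-8 actions
    rec_rate = 1.0 if 6 <= n_recommendations <= 8 else 0.5
    return completion_rate * 0.4 + length_rate * 0.3 + rec_rate * 0.3

# All four agents finished, one section is too short, 7 recommendations
score = quality_score(
    completion_status={"market": True, "competitive": True,
                       "risk": True, "strategic": True},
    lengths={"market_intelligence": 520, "competitive_landscape": 310,
             "risk_evaluation": 250},
    min_lengths={"market_intelligence": 500, "competitive_landscape": 300,
                 "risk_evaluation": 300},
    n_recommendations=7,
)
```

Here one short section drops the length component to 2/3, giving 0.4 + 0.2 + 0.3 = 0.9, in line with the 88.3% average the system reports.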
Cybersecurity Market Analysis:
Key Insights Generated:
Problem: Strategic Advisor needs outputs from all three specialists
Initial Approach (Bad):
```python
# Raw strings passed straight through; nothing checks their shape
result = strategic_agent.execute(
    query, market_data, competitive_data, risk_data
)
```
Better Approach:
```python
class AnalysisResult(BaseModel):
    market_intelligence: str
    competitive_landscape: str
    risk_evaluation: str
    strategic_actions: list[str]

    @field_validator('strategic_actions')
    @classmethod
    def validate_recommendations(cls, v):
        if not (6 <= len(v) <= 8):
            raise ValueError("Must have 6-8 recommendations")
        return v
```
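The same 6-8 constraint can be sketched without Pydantic using a stdlib dataclass. This is a hypothetical stand-in for the model above, not the project's actual code, but it shows the behavior the validator enforces:

```python
from dataclasses import dataclass, field

@dataclass
class AnalysisResultSketch:
    # Stdlib stand-in for the Pydantic model (field names kept)
    market_intelligence: str
    competitive_landscape: str
    risk_evaluation: str
    strategic_actions: list[str] = field(default_factory=list)

    def __post_init__(self):
        # Mirrors the field_validator: reject anything outside 6-8 actions
        if not (6 <= len(self.strategic_actions) <= 8):
            raise ValueError("Must have 6-8 recommendations")

ok = AnalysisResultSketch("m", "c", "r", [f"action {i}" for i in range(6)])

try:
    AnalysisResultSketch("m", "c", "r", ["only one"])
    rejected = False
except ValueError:
    rejected = True
```

Validating at construction time means a malformed result fails loudly before the Strategic Advisor ever sees it.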
Problem: AI returns markdown tables that render poorly
Solution: Convert markdown to HTML dynamically
```python
def _convert_table_to_html(self, table_lines):
    html = "<table style='...'>"
    # Parse markdown, generate styled HTML
    return html
```
Result: Beautiful, readable tables with gold styling
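A hypothetical fleshed-out version of that converter (styling omitted): the first row becomes `<th>` cells, the `|---|` separator row is skipped, and every remaining row becomes `<td>` cells:

```python
def convert_table_to_html(table_lines: list[str]) -> str:
    # Keep rows with real content; drop separator rows like |---|---|
    rows = [
        [cell.strip() for cell in line.strip().strip("|").split("|")]
        for line in table_lines
        if not set(line.replace("|", "").strip()) <= {"-", ":", " "}
    ]
    html = ["<table>"]
    html.append("<tr>" + "".join(f"<th>{c}</th>" for c in rows[0]) + "</tr>")
    for row in rows[1:]:
        html.append("<tr>" + "".join(f"<td>{c}</td>" for c in row) + "</tr>")
    html.append("</table>")
    return "".join(html)

html = convert_table_to_html([
    "| Metric | Value |",
    "|---|---|",
    "| Analysis Time | 95s |",
])
```

The real implementation adds inline styles on the `<table>` tag, which is where the gold theming comes from.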
Problem: Default markdown renders as dark text on black
Solution: Global CSS overrides
```css
.prose *, .markdown * {
  color: #ffd700 !important;
}
```
Result: All text visible and readable
```
Unit Tests (17 tests)
├── Agent initialization
├── Cache key generation
├── Caching behavior
└── Recommendation parsing

Integration Tests (3 tests)
├── Complete workflow
├── Workflow without targets
└── Quality calculation

E2E Tests (3 tests)
├── Cloud computing analysis
├── AI chips analysis
└── Metrics tracking

Quality Tests (2 tests)
├── Readability scoring
└── Completeness validation
```
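As a flavor of the unit-test layer, here is a sketch of a recommendation-parsing test. The `parse_recommendations` helper is hypothetical (the real parser lives in the agent code), but the pytest-style test shows the shape of what gets checked:

```python
def parse_recommendations(text: str) -> list[str]:
    # Hypothetical parser: one numbered recommendation per line ("1. ...")
    recs = []
    for line in text.splitlines():
        line = line.strip()
        if line and line[0].isdigit() and "." in line:
            recs.append(line.split(".", 1)[1].strip())
    return recs

def test_parses_numbered_list():
    text = "1. Expand into EMEA\n2. Partner with MSSPs\nnot a rec"
    assert parse_recommendations(text) == [
        "Expand into EMEA", "Partner with MSSPs"
    ]

# Runnable directly as well as under pytest
test_parses_numbered_list()
```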
We Test:
We Don't Test:
Bad prompt:
"Analyze the cybersecurity market"
Good prompt:
"""Conduct comprehensive market intelligence analysis for: {query} Deliver structured analysis covering: - Market size with specific estimates - Historical growth rates (3-5 years) - Projected CAGR for next 3-5 years - Three most significant market trends Provide specific, quantified insights with concrete data points."""
Not all analyses complete perfectly. That's okay.
Build systems that handle failures gracefully.
Groq's free tier delivers:
Cost for 1,000 analyses:
Don't test:
Do test:
| Metric | Value | Target |
|---|---|---|
| Analysis Time | 95s | <120s ✅ |
| Success Rate | 95% | >90% ✅ |
| Quality Score | 88.3% | >75% ✅ |
| Test Coverage | 74% | >70% ✅ |
| Cost per Analysis | $0 | <$0.50 ✅ |
| Word Count (avg) | 3,200 | >2,500 ✅ |
Building SRIP taught us that production-grade AI systems aren't about having the biggest model or most complex architecture. They're about:
The result? A system that delivers professional-quality analysis in 2 minutes at zero cost.
```shell
git clone https://github.com/yourusername/srip-production-v2
cd srip-production-v2
pip install -r requirements.txt && python -m src.ui.gradio_app
```

Built as part of Applied AI coursework, demonstrating production-grade multi-agent systems. Currently deployed and tested with real-world queries.
Tech Stack: Python 3.12 • Groq API • LangGraph • Gradio • Pytest
Performance: 95% success rate • 88% average quality • 74% test coverage
Technical Repository:
https://github.com/E-Z1937/srip-production-v2