Abstract
The Multi-Agent System Analyzer is a fully automated, AI-driven framework designed to analyze software codebases, extract structured metadata, detect improvement opportunities, evaluate documentation quality, and generate actionable reports.
By leveraging role-specialized agents (AnalyzerAgent, MetadataAgent, and ImproverAgent) coordinated through a centralized orchestrator, the system demonstrates real-world agentic AI principles such as autonomous reasoning, tool usage, iterative refinement, shared context, and coordinated decision-making.
This project illustrates how multi-agent systems can significantly accelerate developer workflows, reduce manual code review effort, and deliver intelligent, context-aware insights across diverse repositories.
1️⃣ Introduction
Modern development teams increasingly work with complex repositories where documentation is outdated, code quality varies, and architectural intent is unclear. Developers often spend hours attempting to understand:
• What the project does
• How the codebase is structured
• Which improvements are needed
• What documentation is missing
Traditional linters and static analyzers focus primarily on syntax and style rules. They lack contextual understanding and cannot reason about architectural intent, documentation quality, or repository-level health.
The Multi-Agent System Analyzer addresses this gap by applying coordinated AI agents capable of contextual reasoning and holistic repository analysis.
2️⃣ Why Multi-Agent Intelligence?
Instead of relying on a monolithic model, this system employs multiple specialized agents, each responsible for a distinct reasoning layer:
Agent            Responsibility
AnalyzerAgent    Understands file content, detects issues, analyzes structure
MetadataAgent    Extracts project-level insights (name, tech stack, structure)
ImproverAgent    Suggests improvements, best practices, and missing documentation
Through orchestration and shared context, these agents collectively produce a deep, human-like understanding of any repository, far exceeding what a single-pass analyzer could achieve.
3️⃣ Real-World Applications
This system is designed with practical, production-relevant use cases in mind:
• Automated Code Reviews - Accelerate pull request reviews by generating structured insights before human review.
• Repository Onboarding - Help new developers quickly understand unfamiliar codebases.
• Technical Debt Assessment - Identify architectural weaknesses, documentation gaps, and maintainability risks.
• Open-Source Quality Audits - Evaluate repository readiness for contributors and long-term maintenance.
• AI-Assisted Documentation Generation - Provide intelligent suggestions for missing or incomplete documentation.
These applications demonstrate the system's value beyond experimentation, positioning it as a developer productivity and quality-assurance tool.
4️⃣ System Requirements
Technical Requirements
• Python 3.10+
• Minimum 4GB RAM (8GB recommended)
• Internet connectivity
• ~1GB disk space
API Requirements
A .env file must be created containing your LLM provider API key (an example is shown in the setup steps below).

Implementation
Project Structure
The Multi-Agent System is organized using a modular, agent-driven architecture. Each directory has a clear responsibility, enabling scalability, maintainability, and clean separation of concerns between agents, tools, UI, and utilities.
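A simplified layout, assembled from the directory breakdown later in this document (the root folder name is illustrative):

multi-agent-system-analyzer/
├── main.py
├── requirements.txt
├── .env
├── README.md
├── agents/
│   ├── analyzer_agent.py
│   ├── improver_agent.py
│   └── metadata_agent.py
├── tools/
│   ├── repo_reader.py
│   ├── summarizer.py
│   └── web_search_tool.py
├── utils/
│   ├── dummy_llm.py
│   ├── your_dummy_llm.py
│   ├── guardrails.py
│   ├── monitoring.py
│   └── retries.py
├── ui/
│   └── app.py
├── sample_repo/
├── tests/
├── config/
└── logs/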

Setting up the environment
Create a new virtual environment and install the dependencies using the requirements.txt file.
requirements.txt
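The exact pinned versions are not reproduced here; based on the stack described in this document (LangChain, OpenAI, Streamlit, FastAPI, requests, dotenv), a minimal dependency list might look like:

langchain
openai
streamlit
fastapi
uvicorn
requests
python-dotenv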

.env
Create a .env file and add your API key.
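For example (the variable name OPENAI_API_KEY is an assumption; use whatever name the code actually reads):

OPENAI_API_KEY=your-api-key-here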

Install Dependencies
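Typical commands, assuming a standard virtual-environment workflow:

python -m venv venv
source venv/bin/activate   # on Windows: venv\Scripts\activate
pip install -r requirements.txt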
ui/ - Streamlit User Interface
The UI layer provides an interactive interface for executing the multi-agent workflow.
• app.py - Allows users to upload repositories, trigger analysis, and view structured outputs in real time.
This component demonstrates end-user accessibility and practical applicability of the agentic system.
app.py
This is the main entry point of the Streamlit application. It uses requests to call the FastAPI endpoints exposed by the backend.
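The repository's actual app.py is not reproduced here. The sketch below only illustrates the described pattern; the endpoint URL, route, and field names are assumptions.

# ui/app.py - illustrative sketch, not the repository's actual code
import requests
import streamlit as st

BACKEND_URL = "http://localhost:8000/analyze"  # assumed FastAPI route

st.title("Multi-Agent System Analyzer")
uploaded = st.file_uploader("Upload a zipped repository", type=["zip"])

if uploaded and st.button("Run Analysis"):
    with st.spinner("Running multi-agent analysis..."):
        response = requests.post(
            BACKEND_URL,
            files={"repo": (uploaded.name, uploaded.getvalue(), "application/zip")},
            timeout=300,
        )
    if response.ok:
        st.json(response.json())  # structured report produced by the agents
        st.download_button("Download JSON report", response.text, "report.json")
    else:
        st.error(f"Analysis failed: {response.status_code}")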

BACKEND
main.py - Orchestration Entry Point
The main execution file initializes agents, invokes tools, coordinates agent interactions, and aggregates results into a unified output. It acts as the central orchestrator for the multi-agent workflow.
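A minimal sketch of the orchestration flow described above (class and function names are assumptions; the real agents and tools live in agents/ and tools/):

# main.py - illustrative orchestration sketch; the real implementation may differ
import json

from agents.analyzer_agent import AnalyzerAgent
from agents.improver_agent import ImproverAgent
from agents.metadata_agent import MetadataAgent
from tools.repo_reader import read_repository  # assumed helper name


def run_workflow(repo_path: str) -> dict:
    """Coordinate the three agents over a shared repository context."""
    context = read_repository(repo_path)                    # shared context for all agents
    analysis = AnalyzerAgent().run(context)                  # issues, structure, gaps
    metadata = MetadataAgent().run(context)                  # name, tech stack, layout
    improvements = ImproverAgent().run(context, analysis)    # recommendations
    return {"analysis": analysis, "metadata": metadata, "improvements": improvements}


if __name__ == "__main__":
    print(json.dumps(run_workflow("sample_repo"), indent=2))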

utils/ - Reliability, Safety, and Observability
This directory contains cross-cutting utilities that improve system robustness and production readiness.
• Guardrails (guardrails.py) - Input and output validation
• Retries (retries.py) - Fault-tolerant retry mechanisms
• Monitoring (monitoring.py) - Logging and execution tracking
• Dummy LLMs (dummy_llm.py, your_dummy_llm.py) - Mock models for offline testing and validation
These utilities ensure the system remains stable, debuggable, and resilient under failure scenarios.
--> utils
dummy_llm.py
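An illustrative stand-in (the real mock may differ):

# utils/dummy_llm.py - illustrative mock for offline runs; real implementation may differ
class DummyLLM:
    """Returns a canned response so the pipeline can run without API keys."""

    def __init__(self, canned_response: str = '{"status": "ok"}'):
        self.canned_response = canned_response

    def invoke(self, prompt: str) -> str:
        # Ignore the prompt and return a fixed, well-formed payload.
        return self.canned_response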

guardrails.py
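A minimal sketch of output validation (key names taken from the example report; the function name is an assumption):

# utils/guardrails.py - illustrative sketch
import json


def validate_output(raw: str) -> dict:
    """Ensure an agent returned parseable JSON with the expected top-level keys."""
    data = json.loads(raw)  # raises ValueError on malformed model output
    for key in ("analysis", "metadata", "improvements"):
        data.setdefault(key, {})
    return data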

monitoring.py
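A minimal logging setup of the kind this module could provide (names assumed):

# utils/monitoring.py - illustrative sketch
import logging


def get_logger(name: str) -> logging.Logger:
    """Create a logger that records agent activity to logs/ and to the console."""
    logger = logging.getLogger(name)
    if not logger.handlers:
        logger.setLevel(logging.INFO)
        file_handler = logging.FileHandler("logs/agents.log")
        file_handler.setFormatter(
            logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
        )
        logger.addHandler(file_handler)
        logger.addHandler(logging.StreamHandler())
    return logger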

retries.py
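A sketch of a retry decorator with exponential backoff (the decorator name and defaults are assumptions):

# utils/retries.py - illustrative sketch
import time
from functools import wraps


def with_retries(max_attempts: int = 3, base_delay: float = 1.0):
    """Retry a flaky call (e.g., an LLM API request) with exponential backoff."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    if attempt == max_attempts:
                        raise  # give up after the final attempt
                    time.sleep(base_delay * 2 ** (attempt - 1))
        return wrapper
    return decorator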

Directory Breakdown
agents/
Contains autonomous, role-based AI agents, each responsible for a specific reasoning task.
• analyzer_agent.py
  - Analyzes repository structure and source code
  - Detects architectural issues, gaps, and inconsistencies
• improver_agent.py
  - Suggests enhancements for code quality, structure, and documentation
  - Focuses on best practices and maintainability improvements
• metadata_agent.py
  - Extracts high-level project metadata
  - Identifies dependencies, configuration patterns, and repository context
This folder demonstrates agent specialization and collaborative intelligence, a core principle of agentic AI systems.
--> agents
analyzer_agent.py
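An illustrative shape for this agent (prompt text, model wrapper, and method names are assumptions; the other two agents follow the same pattern):

# agents/analyzer_agent.py - illustrative sketch; real prompts and parsing differ
from langchain_openai import ChatOpenAI


class AnalyzerAgent:
    """Reads repository content and reports issues and documentation gaps."""

    def __init__(self, model: str = "gpt-4o-mini"):
        self.llm = ChatOpenAI(model=model, temperature=0)

    def run(self, context: str) -> dict:
        prompt = (
            "You are a code analysis agent. Review the repository content below and "
            "return JSON with 'project_overview', 'detected_issues', and "
            "'documentation_gaps'.\n\n" + context
        )
        response = self.llm.invoke(prompt)
        return {"raw_output": response.content}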

improver_agent.py

metadata_agent.py

tools/
Reusable tools that agents rely on to interact with external data and repositories.
• repo_reader.py
  - Reads repository files and directories
  - Handles file traversal and safe access
• summarizer.py
  - Condenses agent outputs into structured summaries
• web_search_tool.py
  - Supports external knowledge lookup when enabled
  - Enhances reasoning with contextual information
--> tools
repo_reader.py
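A minimal sketch of the repository-reading tool (function name, skip list, and size cap are assumptions):

# tools/repo_reader.py - illustrative sketch
from pathlib import Path

SKIP_DIRS = {".git", "__pycache__", "venv", "node_modules"}


def read_repository(repo_path: str, max_bytes: int = 20_000) -> str:
    """Walk the repository and concatenate readable source files into one context string."""
    chunks = []
    for path in sorted(Path(repo_path).rglob("*")):
        if path.is_file() and not SKIP_DIRS & set(path.parts):
            try:
                text = path.read_text(encoding="utf-8", errors="ignore")[:max_bytes]
            except OSError:
                continue  # skip unreadable files instead of failing the whole run
            chunks.append(f"### {path}\n{text}")
    return "\n\n".join(chunks)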

summarizer.py

web_search_tool.py

sample_repo/ - Demonstration & Validation Repository
A controlled example repository used to test and validate agent behavior without impacting real-world projects. This enables repeatable evaluation during certification review.
Supporting Infrastructure
• tests/ - Unit and integration tests
• config/ - Configuration files
• logs/ - Runtime logs and monitoring output
• requirements.txt - Dependency management
• .env - Environment variable configuration
• README.md - Documentation and usage guide
How to Run the System
Option 1 - Run the Multi-Agent Workflow
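Assuming main.py is the entry point described above:

python main.py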

Option 2 - Run the Streamlit UI
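Assuming the UI lives at ui/app.py as shown in the project structure (if the FastAPI backend runs as a separate service, start it first; the exact module path depends on the repository):

streamlit run ui/app.py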


Example Output:

JSON OUTPUT:
{
  "analysis": {
    "project_overview": "The repository appears to contain a Python-based multi-agent system designed for automated code analysis, metadata extraction, and improvement recommendations. It centers around modular agents (AnalyzerAgent, MetadataAgent, ImproverAgent) coordinated through a main orchestrator script. The system relies on OpenAI models via LangChain and is capable of reading repositories, analyzing structure, and generating insights.",
    "detected_issues": [
      "README.md missing or incomplete in the repository",
      "Inconsistent naming conventions across certain Python modules",
      "No comprehensive documentation for agents behavior and expected outputs",
      "Lack of structured logs in some parts of the system",
      "Missing tests for edge cases in MetadataAgent and AnalyzerAgent",
      "Environment variables not validated before model initialization"
    ],
    "documentation_gaps": [
      "Project goal not clearly described",
      "Missing installation instructions for beginners",
      "No explanation of the multi-agent architecture",
      "No usage examples or input/output samples",
      "No explanation of how .env variables influence workflow"
    ]
  },
  "metadata": {
    "project_name": "Multi-Agent System Analyzer",
    "description": "A modular multi-agent system built using Python, LangChain, and OpenAI models to automatically analyze repository contents, extract metadata, and generate structured recommendations.",
    "technologies": [
      "Python",
      "LangChain",
      "OpenAI GPT-4o-mini",
      "Streamlit (optional UI)",
      "dotenv",
      "Logging utilities"
    ],
    "author": "Unknown (not specified in repository)",
    "tags": [
      "multi-agent",
      "AI agents",
      "repository-analysis",
      "automation",
      "langchain"
    ],
    "license": "MIT (or unspecified)",
    "repo_structure": {
      "root_files": [
        "main.py",
        "requirements.txt",
        "README.md",
        ".env"
      ],
      "directories": [
        "agents/",
        "tools/",
        "utils/",
        "ui/",
        "tests/",
        "logs/"
      ],
      "agents": {
        "AnalyzerAgent": "Performs repository content analysis and summarizes issues.",
        "MetadataAgent": "Extracts structured metadata from README or repo content.",
        "ImproverAgent": "Generates improvements and best-practice recommendations."
      },
      "raw_output": "{... full extracted JSON from model ...}"
    }
  },
  "improvements": {
    "code_quality": [
      "Refactor repeated code into utility functions under utils/",
      "Add type annotations across all agent methods",
      "Use consistent snake_case naming convention",
      "Improve exception handling around LLM API interactions",
      "Introduce retry mechanisms for rate-limits"
    ],
    "documentation": [
      "Add comprehensive README.md including architecture diagram",
      "Document each agent class with docstrings",
      "Add examples of input and output in README",
      "Include troubleshooting steps for API issues",
      "Provide clear environment configuration guide"
    ],
    "architecture": [
      "Create dedicated Orchestrator class instead of mixing logic in main.py",
      "Add agent registry for easier addition/removal of new agents",
      "Introduce message bus or event system for cleaner communication",
      "Implement caching layer for repeated LLM calls"
    ],
    "best_practices": [
      "Use .env.example template for sharing environment variables",
      "Create CONTRIBUTING.md for open-source collaboration",
      "Set up GitHub Actions for tests and lint checks",
      "Add pre-commit hooks to enforce formatting and linting",
      "Use semantic versioning for future releases"
    ]
  },
  "summary": {
    "highlights": [
      "The repository implements a multi-agent workflow using LangChain and OpenAI.",
      "Agents collaborate to analyze code, extract metadata, and suggest improvements.",
      "Documentation is the most critical missing component.",
      "Architecture can be enhanced with better modularization and agent orchestration."
    ],
    "overall_quality_score": 7.2,
    "recommendation": "Add full documentation, improve consistency across modules, and enhance architecture for scalability."
  }
}

Key Concepts
• Role-specialized agents
• Shared memory
• LLM-driven reasoning
• Fault-tolerant pipelines
• Structured outputs
Resilience, Observability & Reliability
To enhance production readiness, the system incorporates or recommends:
• Retry mechanisms for LLM API failures
• Timeout handling for long-running agent tasks
• Structured logging for agent decisions
• Graceful fallback agents to prevent workflow failure
• Clear error messaging for configuration issues
These features ensure stability, debuggability, and long-term maintainability.
Features
✅ Multi-Agent Reasoning Pipeline
✅ Automated Metadata Extraction
✅ Code Quality & Architecture Analysis
✅ Documentation Gap Detection
✅ Streamlit-Based Interactive UI
✅ Exportable JSON Reports
✅ Robust Error Handling
Future Enhancements
• Agent memory persistence
• Plugin-based agent registry
• Caching layer for repeated LLM calls
• GitHub Actions integration
• Multi-repo batch analysis
License
This project is released under the MIT License.