Ready to unlock your project's true potential? Give RepoSpector AI a try!
In modern software development, a GitHub repository serves as the primary gateway to a project. It is both a portfolio and a user's first impression. While countless hours are dedicated to writing brilliant code, the crucial "last mile" of project presentation (documentation, structure, and adherence to community standards) is often neglected. This gap can significantly hinder a project's adoption, collaboration, and perceived quality.
RepoSpector AI was developed to address this challenge directly. It is a sophisticated multi-agent system designed to act as an automated senior engineering reviewer. By leveraging a crew of specialized AI agents, it performs a comprehensive analysis of any public GitHub repository and delivers actionable, expert-level feedback. This project moves beyond simple linting tools by evaluating the holistic quality of a project's presentation.
This publication details the design, architecture, and implementation of RepoSpector AI. Developed as a deliverable for the Agentic AI Developer Certification Program, it serves as both a practical, high-utility tool and an educational blueprint for building advanced, agentic applications with modern engineering practices.
The system is built on three main technologies. CrewAI provides the orchestration framework for defining and managing the specialized agents. Streamlit was chosen to build the web interface, making the tool accessible to all users without requiring command-line interaction. At its core, OpenAI's GPT-4 provides the reasoning capabilities for analysis and report generation.
The system architecture was guided by five primary objectives:
RepoSpector AI's operation is modeled after a real-world peer review process. It unfolds in three distinct phases, with each agent taking the lead in its area of expertise.
The process begins when a user submits a GitHub URL through the Streamlit interface. This triggers the RepoAnalyst agent.

The agent uses the GitPython library to securely clone the target repository into a temporary, isolated environment. This provides full access to the project's files and structure.

RepoAnalyst traverses the repository's file system to check for the presence and proper placement of critical components. It programmatically verifies the existence of a LICENSE file, a .gitignore file, core application code within a src/ directory, and tests within a tests/ directory.

It also reads the contents of README.md and any dependency files (requirements.txt or pyproject.toml), preparing this information for the next phase. The output of this phase is a structured JSON object containing the raw content and a summary of the structural analysis, which is then passed as context to the next agent.

With the structural analysis complete, the DocumentationSpecialist takes over. This agent's sole focus is on the quality of the project's primary user-facing document: the README.md.
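The handoff between these two agents can be pictured with a small sketch. It is not the project's actual tool: it assumes the repository has already been cloned to a local directory, uses only the standard library, and the function names and payload keys (`build_handoff`, `structure`, `missing`, `readme`, `dependency_files`) are hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional

# Components named in the text; how they are grouped here is an assumption.
REQUIRED_FILES = ("LICENSE", ".gitignore")
REQUIRED_DIRS = ("src", "tests")
DEPENDENCY_FILES = ("requirements.txt", "pyproject.toml")


def read_if_present(path: Path) -> Optional[str]:
    """Return the file's text, or None when the file does not exist."""
    if not path.is_file():
        return None
    return path.read_text(encoding="utf-8", errors="replace")


def build_handoff(repo_root: Path) -> dict:
    """Check the repository layout and gather raw content for the next agent."""
    structure = {name: (repo_root / name).is_file() for name in REQUIRED_FILES}
    structure.update({f"{name}/": (repo_root / name).is_dir() for name in REQUIRED_DIRS})

    dependency_files = {}
    for name in DEPENDENCY_FILES:
        content = read_if_present(repo_root / name)
        if content is not None:
            dependency_files[name] = content

    return {
        "structure": structure,
        "missing": [name for name, found in structure.items() if not found],
        "readme": read_if_present(repo_root / "README.md"),
        "dependency_files": dependency_files,
    }


if __name__ == "__main__":
    # A temporary directory stands in for a freshly cloned repository.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "README.md").write_text("# Demo\n", encoding="utf-8")
        (root / "src").mkdir()
        print(json.dumps(build_handoff(root), indent=2))
```

Keeping the payload a plain dictionary means it serializes directly to JSON, which matches the structured handoff described above.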
Working from the README.md content provided by the RepoAnalyst, this agent assesses the document against a rubric of best practices. It checks for key sections such as a project overview, installation instructions, usage examples, and license information.

The DocumentationSpecialist formulates a qualitative assessment, noting both the strengths of the documentation and its specific weaknesses. This analysis is then passed on to the final agent.

The final phase is managed by the ChiefReviewer agent, which acts as the project lead. It receives the structured data from the RepoAnalyst and the qualitative feedback from the DocumentationSpecialist.
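One of those two inputs, the README assessment, can be approximated with a deterministic pre-check. The sketch below is an illustration only: the real agent relies on an LLM for its qualitative judgment, and the keyword patterns and the `assess_readme` name are assumptions. Only the four rubric sections come from the text.

```python
import re

# The four sections come from the text; the keyword patterns are guesses.
RUBRIC = {
    "overview": r"overview|about|introduction",
    "installation": r"install|setup|getting started",
    "usage": r"usage|example|quick ?start",
    "license": r"licen[sc]e",
}

# Matches markdown ATX headings such as "## Installation".
HEADING = re.compile(r"^#{1,6}[ \t]+(.+?)[ \t]*$", re.MULTILINE)


def assess_readme(readme_text: str) -> dict:
    """Report which rubric sections appear among the README's headings."""
    headings = [heading.lower() for heading in HEADING.findall(readme_text)]
    present = {
        section: any(re.search(pattern, heading) for heading in headings)
        for section, pattern in RUBRIC.items()
    }
    return {
        "strengths": [section for section, found in present.items() if found],
        "weaknesses": [section for section, found in present.items() if not found],
    }


if __name__ == "__main__":
    sample = "# Tool\n\n## Installation\npip install tool\n\n## Usage\nrun it\n"
    print(assess_readme(sample))
```

A check like this only detects whether a section exists. Judging whether the installation steps are actually clear is the part that needs the language model.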
The agent weighs the findings by severity: for example, a missing LICENSE is flagged as more critical than a minor typo in the README.

The ChiefReviewer then drafts the final, user-facing markdown report. It structures the feedback into clear sections: an overall summary, a list of positive attributes ("✅ What's Good"), a prioritized list of issues ("⚠️ Areas for Improvement"), and a concrete checklist for remediation ("Action Plan").

The project follows a clean, modular architecture that promotes maintainability and extensibility, with a clear separation of concerns between the user interface, agentic core, and supporting tools.
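Before looking at the individual files, the ChiefReviewer's prioritization and report assembly can be sketched in a few lines. This is a plain-Python illustration under stated assumptions, not the agent's prompt or code: the severity levels, the `Finding` class, and `render_report` are all hypothetical. The text only establishes that a missing LICENSE outranks a minor README typo and that the report has the three named sections.

```python
from dataclasses import dataclass

# Assumed severity scale; lower rank means higher priority.
SEVERITY_ORDER = {"critical": 0, "major": 1, "minor": 2}


@dataclass(frozen=True)
class Finding:
    message: str
    severity: str  # one of the keys in SEVERITY_ORDER
    fix: str


def render_report(summary: str, strengths: list[str], findings: list[Finding]) -> str:
    """Sort findings by severity and lay them out as a markdown report."""
    ranked = sorted(findings, key=lambda finding: SEVERITY_ORDER[finding.severity])
    lines = ["# Repository Review", "", summary, "", "## What's Good"]
    lines += [f"- {item}" for item in strengths]
    lines += ["", "## Areas for Improvement"]
    lines += [f"- [{finding.severity}] {finding.message}" for finding in ranked]
    lines += ["", "## Action Plan"]
    lines += [f"- [ ] {finding.fix}" for finding in ranked]
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_report(
        "Solid code, incomplete packaging.",
        ["Tests live in tests/"],
        [
            Finding("Typo in README heading", "minor", "Fix the heading spelling"),
            Finding("No LICENSE file", "critical", "Add a LICENSE file"),
        ],
    ))
```

Because `sorted` is stable, findings with the same severity keep the order in which the earlier agents reported them.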
app.py (Main Application Interface): The entry point for the Streamlit web application. It handles all user interaction, state management, and communication with the agentic backend.

src/repospector_ai/ (Core Logic Module):
  agents.py: Defines the three specialized CrewAI agents (RepoAnalyst, DocumentationSpecialist, ChiefReviewer), including their roles, goals, and assigned tools.
  tasks.py: Defines the specific tasks for each agent and orchestrates the sequence of the review process within the CrewAI framework.
  tools/repo_analysis_tool.py: Contains the custom tool for cloning and analyzing the file structure of a GitHub repository.
  core/config.py: Manages all application settings using Pydantic, securely loading API keys and other configurations from environment variables.
  core/logger.py: Implements a centralized, structured JSON logger for logging and debugging.

tests/ (Testing Suite): Contains all unit tests, written with pytest. This ensures the reliability of critical components like the repo_analysis_tool.

Dockerfile (Containerization): A multi-stage Dockerfile allows for building a lightweight, production-ready container for easy deployment and scalability.

requirements.txt & requirements-dev.txt: Specify all dependencies for production and development.

.pre-commit-config.yaml: Configures automated code quality checks with tools like Black, Ruff, and MyPy.

The system requires Python 3.11+ and an OpenAI API key for its operation.
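Since the API key is supplied through the environment, the role of core/config.py can be illustrated with a dependency-free stand-in. The project itself uses Pydantic for this. The sketch below swaps in a standard-library dataclass so it runs anywhere. `OPENAI_API_KEY` is the variable name conventionally used with OpenAI tooling, while `REPOSPECTOR_MODEL`, `LOG_LEVEL`, and the `Settings` fields are assumptions made for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Illustrative settings container; field names are hypothetical."""
    openai_api_key: str
    model_name: str = "gpt-4"
    log_level: str = "INFO"


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from environment variables, failing fast without a key."""
    env = os.environ if env is None else env
    api_key = env.get("OPENAI_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is not set in the environment")
    return Settings(
        openai_api_key=api_key,
        model_name=env.get("REPOSPECTOR_MODEL", "gpt-4"),  # assumed variable name
        log_level=env.get("LOG_LEVEL", "INFO").upper(),    # assumed variable name
    )


if __name__ == "__main__":
    settings = load_settings({"OPENAI_API_KEY": "sk-example", "LOG_LEVEL": "debug"})
    print(settings.model_name, settings.log_level)
```

This version reads only the process environment. It does not parse a .env file, which would need an additional loader. Failing at startup when the key is absent surfaces the problem before any agent is invoked.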
Clone the repository and enter the project directory:
git clone https://github.com/YanCotta/repospector-ai.git
cd repospector-ai

Create and activate a virtual environment:
python -m venv .venv
source .venv/bin/activate

Install the dependencies:
pip install -r requirements.txt

Create a .env file and add your OpenAI API key:
cp .env.example .env
# Edit the .env file with your key

Start the application:
streamlit run app.py

The application will be accessible at http://localhost:8501 in your web browser.

This project successfully demonstrates the design and implementation of a high-utility, interactive multi-agent system. By combining the powerful orchestration of CrewAI with the accessibility of a Streamlit interface, RepoSpector AI effectively transforms the complex, nuanced task of a code repository review into an automated, on-demand service. The modular architecture and adherence to professional engineering practices not only ensure the tool is robust and maintainable but also establish this project as a valuable, practical blueprint for developers building their own sophisticated AI applications.
While the current implementation provides a strong foundation, several exciting avenues for future development exist:
Integrate additional static analysis tools (for example, bandit for security and radon for complexity) as new capabilities for the RepoAnalyst agent.

Enable the ChiefReviewer to not only suggest changes but also draft and propose a complete, improved README.md file that the user could adopt directly.

Enable the DocumentationSpecialist to use its web search tool to find and reference best-in-class README.md files from similar projects as examples in its feedback.