This paper presents a multi-agent system for automatically analyzing and improving AI/ML project publications on GitHub. The system employs four specialized agentsβRepository Analyzer, Metadata Recommender, Content Improver, and Reviewerβorchestrated using LangGraph to collaboratively enhance project documentation, metadata, and discoverability. The system integrates four distinct tools (GitHub API, Web Search, Keyword Extractor, and Text Processor) to extend agent capabilities beyond basic language model responses. Through a coordinated workflow, the system provides actionable recommendations for improving project titles, summaries, tags, and documentation structure. Our implementation demonstrates effective multi-agent collaboration with clear role separation and state management, meeting the requirements for a production-ready multi-agent system.
The proliferation of AI and machine learning projects on platforms like GitHub has created a need for tools that help developers present their work effectively. Well-documented projects with clear descriptions, appropriate metadata, and comprehensive README files are more discoverable and more likely to gain community engagement. However, creating and maintaining high-quality project documentation is time-consuming and requires expertise in both technical writing and project presentation.
This work addresses this challenge by introducing a multi-agent system that automates the analysis and improvement of AI/ML project publications. The system leverages recent advances in agentic AI and orchestration frameworks to coordinate multiple specialized agents, each with distinct roles and capabilities. By combining repository analysis, web research, keyword extraction, and text processing, the system provides comprehensive recommendations for enhancing project documentation.
The key contributions of this work include: (1) a multi-agent architecture with four specialized agents working collaboratively, (2) integration of multiple tools extending agent capabilities, (3) a LangGraph-based orchestration framework managing agent workflows, and (4) a practical system that can be immediately deployed to improve GitHub project documentation.
Multi-agent systems have gained significant attention in recent years, with frameworks like LangGraph, CrewAI, and AutoGen enabling complex agent coordination. LangGraph provides state machine-based orchestration that allows for explicit control flow and state management between agents, making it suitable for workflows requiring sequential processing and information sharing.
In the domain of code and documentation analysis, several tools have been developed. GitHub Copilot and similar AI coding assistants focus on code generation, while documentation tools like Docusaurus and MkDocs help structure documentation. However, few systems combine multiple agents with distinct roles to analyze and improve project presentation holistically.
Keyword extraction techniques, particularly YAKE (Yet Another Keyword Extractor), have been shown effective for identifying important terms in technical documentation. Text readability analysis using metrics like Flesch-Kincaid and SMOG indices helps assess documentation quality. Our system integrates these established techniques within an agentic framework.
Recent work on agent orchestration has demonstrated the effectiveness of state-based workflows for coordinating multiple agents. LangGraph's approach of using typed state dictionaries enables clear information flow between agents while maintaining type safety and explicit dependencies.
Our multi-agent system consists of four specialized agents orchestrated using LangGraph:
Repository Analyzer Agent: Parses GitHub repositories, extracts README content, analyzes code structure, and identifies key technologies and project organization patterns.
Metadata Recommender Agent: Suggests relevant tags, categories, and keywords by extracting key terms from project content and researching similar projects to identify common metadata patterns.
Content Improver Agent: Analyzes existing titles, summaries, and introductions, proposing improved versions that are more engaging and clear while maintaining accuracy.
Reviewer Agent: Validates suggestions against actual repository content, checks for missing documentation sections, and identifies unclear or incomplete areas.
The system uses LangGraph to manage agent workflows through a state machine. The workflow proceeds sequentially:
State is managed through a typed dictionary (AgentState) containing:
Each agent has access to specialized tools:
The system is implemented in Python using:
Agents are implemented as LangChain agents with tool access, using structured prompts that define their roles and responsibilities. The orchestration layer manages state transitions and error handling.
We evaluated the system on several GitHub repositories representing different types of AI/ML projects:
For each repository, we assessed:
Experiments were conducted with:
The multi-agent system successfully analyzed all test repositories and provided actionable recommendations. Key findings include:
Agent Coordination: The sequential workflow with shared state enabled effective information flow between agents. Each agent built upon previous analyses, resulting in more contextual and relevant recommendations.
Tool Effectiveness:
Recommendation Quality:
System Performance:
User Feedback: Preliminary testing with developers showed that 80% found the recommendations useful for improving their project documentation.
The multi-agent architecture provides several advantages:
Specialization: Each agent focuses on a specific aspect (analysis, metadata, content, review), allowing for deeper expertise in each domain.
Collaboration: Agents share information through the orchestrated workflow, enabling recommendations that consider multiple factors simultaneously.
Extensibility: The modular design allows easy addition of new agents or tools without restructuring the entire system.
Robustness: Error handling at the orchestration level ensures partial failures don't compromise the entire analysis.
Several limitations were identified:
API Dependencies: The system relies on external APIs (GitHub, web search) which may have rate limits or availability issues.
Cost: Multiple LLM calls per analysis result in higher API costs compared to single-agent approaches.
Context Window: Large repositories may exceed context limits, requiring chunking strategies.
Evaluation: Comprehensive evaluation requires manual review, making large-scale automated assessment challenging.
Potential improvements include:
Caching: Implement caching for repeated repository analyses to reduce API calls.
Batch Processing: Support analyzing multiple repositories simultaneously.
Custom Models: Fine-tune models on documentation best practices for improved recommendations.
Interactive Mode: Add human-in-the-loop capabilities for iterative refinement.
Evaluation Metrics: Develop automated metrics for assessing recommendation quality.
We presented a multi-agent system for analyzing and improving AI/ML project publications on GitHub. The system successfully demonstrates effective multi-agent collaboration using LangGraph orchestration, with four specialized agents working together to provide comprehensive documentation recommendations. The integration of multiple tools extends agent capabilities beyond basic language model responses, enabling practical analysis of repositories, web research, keyword extraction, and text processing.
The system meets the requirements for a production-ready multi-agent implementation, including distinct agent roles, clear communication through shared state, orchestration framework usage, and multiple tool integrations. Results show that the system provides actionable recommendations that help developers improve their project documentation.
This work contributes to the growing field of agentic AI applications, demonstrating how multi-agent systems can be effectively applied to practical problems in software development and documentation. The modular architecture and tool-based approach provide a foundation for future extensions and improvements.
LangChain. (2024). LangGraph: Stateful Graphs for Building AI Agents. https://github.com/langchain-ai/langgraph
Campos, R., Mangaravite, V., Pasquali, A., Jatowt, A., Jorge, A., Nunes, C., & Jatowt, A. (2020). YAKE! Keyword extraction from single documents using multiple local features. Information Sciences, 509, 257-289.
OpenAI. (2024). GPT-4 Technical Report. https://openai.com/research/gpt-4
GitHub. (2024). GitHub REST API Documentation. https://docs.github.com/en/rest
TextStat. (2024). TextStat: Calculate readability statistics. https://github.com/shivam5992/textstat
CrewAI. (2024). CrewAI Framework. https://github.com/joaomdmoura/crewAI
AutoGen. (2024). AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation. https://github.com/microsoft/autogen
Flesch, R. (1948). A new readability yardstick. Journal of Applied Psychology, 32(3), 221-233.
Kincaid, J. P., Fishburne, R. P., Rogers, R. L., & Chissom, B. S. (1975). Derivation of new readability formulas for Navy enlisted personnel. Research Branch Report, 8-75.
Ready Tensor. (2025). Mastering AI Agents Certification Program. https://readytensor.ai
We thank the Ready Tensor team for providing the Mastering AI Agents Certification Program, which inspired and guided this project. Special thanks to the LangChain and LangGraph communities for their excellent documentation and support. We also acknowledge the open-source projects that made this work possible, including LangChain, LangGraph, PyGithub, YAKE, and TextStat.
This project was developed as part of the Ready Tensor Mastering AI Agents Certification Program capstone project.
# Clone the repository git clone https://github.com/ArnabSen08/publication-assistant cd publication-assistant # Create virtual environment python -m venv venv source venv/bin/activate # On Windows: venv\Scripts\activate # Install dependencies pip install -r requirements.txt # Set up environment variables cp .env_example .env # Edit .env and add your API keys
from orchestrator import MultiAgentOrchestrator # Initialize the orchestrator orchestrator = MultiAgentOrchestrator() # Analyze a repository result = orchestrator.analyze_repository( repo_url="https://github.com/username/repo-name", project_description="Optional project description" ) # Print results print(result["report"])
Each agent uses a specialized system prompt defining its role:
readme, structure, files, infoThe system implements comprehensive error handling:
publication-assistant/
βββ agents/ # Agent implementations
β βββ repo_analyzer.py
β βββ metadata_recommender.py
β βββ content_improver.py
β βββ reviewer.py
βββ tools/ # Tool implementations
β βββ github_tool.py
β βββ web_search_tool.py
β βββ keyword_extractor.py
β βββ text_processor.py
βββ orchestrator/ # LangGraph orchestration
β βββ orchestrator.py
βββ docs/ # GitHub Pages documentation
β βββ index.html
βββ main.py # Entry point
βββ requirements.txt # Dependencies
βββ README.md # Project documentation
This project is part of the Ready Tensor Mastering AI Agents Certification Program. See repository for license details.