MAS4MAS

🤖 Meta-Agentic System

A Multi-Agent AI Framework for Automated System Generation

The Meta-Agentic System is an innovative framework that leverages multiple Large Language Models (LLMs) to automatically design, implement, test, and deploy custom multi-agent systems from high-level user prompts. Think of it as "AI that creates AI systems."

🌟 Overview

This system employs a sophisticated workflow with four specialized agents:

PlannerAgent - Designs the architecture and specifications
CoderAgent - Generates complete Python implementations
TesterAgent - Validates code quality and plan adherence
DeployerAgent - Packages and deploys the generated system

The orchestrator manages the workflow, including an iterative refinement loop that sends failing code back to the CoderAgent for correction.

🏗️ Architecture

System Flow

User Prompt → Planner → Coder → Tester → [Refine if needed] → Deploy
                                   ↓
                            (Success Check)

LLM Model Assignment

Each LLM-backed agent is assigned a fixed model suited to its task:

PlannerAgent: claude-3-opus-20240229 (Anthropic) - Architecture & planning
CoderAgent: gpt-4o (OpenAI) - Code generation & refinement
TesterAgent: gemini-1.5-pro-latest (Google) - Quality assurance
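
A minimal sketch of how this assignment could be expressed as a lookup table. The names `AGENT_MODELS` and `model_for` are hypothetical and not taken from the project; only the model identifiers come from the list above. DeployerAgent is absent because no model is listed for it.

```python
# Hypothetical mapping from agent name to model identifier.
AGENT_MODELS = {
    "PlannerAgent": "claude-3-opus-20240229",
    "CoderAgent": "gpt-4o",
    "TesterAgent": "gemini-1.5-pro-latest",
}


def model_for(agent_name: str) -> str:
    """Return the model assigned to an agent, or raise if the agent has none."""
    try:
        return AGENT_MODELS[agent_name]
    except KeyError:
        raise ValueError(f"No model assigned to agent: {agent_name}") from None
```

Keeping the mapping in one place means swapping a model is a one-line change.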

📁 Project Structure

Core Files

`main.py`

Purpose: Entry point for the application

Key Components:

Initializes the system and displays welcome message
Captures user input describing the desired multi-agent system
Instantiates the Orchestrator with configuration (max_loops=3)
Triggers the complete workflow execution

Usage Example:

python main.py
# Prompts: "What multi-agent system would you like to create?"

`orchestrator.py`

Purpose: Central coordinator managing state and agent workflow

Key Components:

Orchestrator class maintains global state across agents
Manages the execution sequence: Plan → Code → Test → Deploy
Implements iterative refinement loop (up to 3 attempts)
Handles error propagation and workflow termination

State Management:

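state = {
    "user_prompt": "",      # str: the user's request
    "plan": {},             # dict: specification from PlannerAgent
    "generated_code": {},   # dict: output of CoderAgent
    "test_results": {},     # dict: report from TesterAgent
    "error": None,          # str, set only if a step fails
}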

Workflow Logic:

Planning Phase: PlannerAgent generates JSON specification
Coding Phase: CoderAgent generates code files
Testing Phase: TesterAgent validates output
Refinement Loop: If tests fail, code is sent back to CoderAgent
Deployment Phase: On success, DeployerAgent creates project structure
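
The workflow above can be sketched as a plain loop. This is an illustration under assumptions, not the project's orchestrator: the agents are passed in as callables that take and return the state dict, `run_workflow` is a hypothetical name, and the `success` key inside `test_results` is assumed from the TesterAgent description.

```python
from typing import Any, Callable, Dict

State = Dict[str, Any]
Step = Callable[[State], State]


def run_workflow(user_prompt: str, planner: Step, coder: Step, tester: Step,
                 deployer: Step, max_loops: int = 3) -> State:
    """Plan once, then code and test up to max_loops times; deploy on success."""
    state: State = {"user_prompt": user_prompt}
    state = planner(state)
    if state.get("error"):
        return state  # stop early if planning failed

    for _ in range(max_loops):
        state = coder(state)   # first pass generates, later passes refine
        state = tester(state)
        results = state.get("test_results") or {}
        if results.get("success"):
            return deployer(state)

    # Every attempt failed, so the deployer is never called.
    state["error"] = f"Tests still failing after {max_loops} attempts"
    return state
```

Passing the agents in as callables keeps the loop testable with stubs, without any API calls.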

`agents.py`

Purpose: Defines the four specialized agent classes

Agent Classes:

BaseAgent (Abstract Base Class)
- Common interface for all agents
- Provides name attribute and abstract run() method
PlannerAgent
- Converts user prompts into structured JSON specifications
- Outputs: project_name, agents_to_create, required_tools, workflow, dependencies
- Uses Claude for architectural thinking
CoderAgent
- Generates complete Python implementation from plan
- Creates: main.py, agents.py, tools.py, requirements.txt
- Includes refinement capability when tests fail
- Uses GPT-4o for code generation
TesterAgent
- Performs static code analysis without execution
- Checks plan adherence, syntax, imports, and logic
- Returns structured JSON with success status and actionable errors
- Uses Gemini for comprehensive analysis
DeployerAgent
- Creates directory structure for generated system
- Writes all code files to disk
- Generates README.md for the new project
- Only executes if tests pass

Agent Communication Pattern:

def run(self, state: dict) -> dict:
    # Process state
    # Call LLM
    # Update state
    return state
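
A minimal sketch of this pattern using Python's `abc` module. `EchoAgent` is a hypothetical stand-in that skips the LLM call; the real agents' internals are not shown in this document.

```python
from abc import ABC, abstractmethod


class BaseAgent(ABC):
    """Common interface: every agent has a name and a run() method."""

    def __init__(self, name: str) -> None:
        self.name = name

    @abstractmethod
    def run(self, state: dict) -> dict:
        """Read from the shared state, do the agent's work, return the state."""


class EchoAgent(BaseAgent):
    """Hypothetical agent that records the prompt instead of calling an LLM."""

    def run(self, state: dict) -> dict:
        state["plan"] = {"summary": state["user_prompt"]}
        return state
```

Because `run()` is abstract, instantiating `BaseAgent` directly raises `TypeError`, so each concrete agent has to supply its own implementation.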

`llm.py`

Purpose: Unified interface for multiple LLM providers

Key Features:

Abstracts differences between Anthropic, OpenAI, and Google APIs
Configures clients with API keys from config.py
Maps logical model names to actual model identifiers
Provides consistent call_llm() interface

Function Signature:

def call_llm(model_name: str, system_prompt: str, user_prompt: str) -> str

Model Routing:

Models containing "claude" → Anthropic API
Models containing "gpt" → OpenAI API
Models containing "gemini" → Google Generative AI API

Error Handling: Returns descriptive error messages on API failures
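
The routing rule can be sketched as a substring check. The function name `provider_for` is an assumption for illustration, and the returned labels are chosen to match the keys of `API_KEYS` in config.py; the actual `llm.py` may structure this differently.

```python
def provider_for(model_name: str) -> str:
    """Pick a provider label from a model name by substring match."""
    name = model_name.lower()
    if "claude" in name:
        return "anthropic"
    if "gpt" in name:
        return "openai"
    if "gemini" in name:
        return "google"
    raise ValueError(f"Unsupported model: {model_name}")
```

This sketch raises on an unknown model name so that a typo cannot be routed to the wrong provider.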

`prompts.py`

Purpose: Centralized prompt templates for all agents

Prompt Definitions:

PLANNER_PROMPT
- Instructs Claude to act as system architect
- Defines strict JSON output format
- Emphasizes unambiguous specifications
- Template variables: {{USER_PROMPT}}
CODER_PROMPT
- Instructs GPT to act as Python developer
- Specifies exact file structure requirements
- Emphasizes plan adherence
- Template variables: {{PLAN}}
CODER_REFINEMENT_PROMPT
- Used when initial code fails tests
- Provides original plan, previous code, and error report
- Instructs to output complete corrected files
- Template variables: {{PLAN}}, {{PREVIOUS_CODE}}, {{TEST_REPORT}}
TESTER_PROMPT
- Instructs Gemini to act as QA engineer
- Defines five-step analysis process
- Specifies JSON response format with success flag
- Template variables: {{PLAN}}, {{CODE_TO_TEST}}
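
A minimal sketch of filling the `{{NAME}}` placeholders listed above. `fill_template` and the sample template text are hypothetical. A regular expression is used rather than `str.format`, since `str.format` treats `{{` as an escaped brace rather than a placeholder.

```python
import re

_PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def fill_template(template: str, **values: str) -> str:
    """Replace each {{KEY}} with its value in one pass; unknown keys stay as-is."""
    return _PLACEHOLDER.sub(lambda m: values.get(m.group(1), m.group(0)), template)


# Hypothetical template text, for illustration only.
template = "Plan:\n{{PLAN}}\n\nCode to review:\n{{CODE_TO_TEST}}"
prompt = fill_template(
    template,
    PLAN='{"project_name": "demo"}',
    CODE_TO_TEST="print('hi')",
)
```

The single pass means placeholder-like text inside an inserted value, such as generated code that contains `{{PLAN}}`, is not substituted again.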

Design Philosophy: Prompts are designed to produce structured, parseable JSON outputs while ensuring each agent understands its role and constraints.
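
An LLM reply may surround the JSON with prose or a code fence, so a parser usually needs some tolerance. A minimal sketch, assuming the reply contains a single JSON object; `parse_json_reply` is a hypothetical helper, and the `success` and `errors` keys in the usage lines are assumed from the TesterAgent description.

```python
import json


def parse_json_reply(reply: str) -> dict:
    """Parse the span from the first '{' to the last '}' in an LLM reply."""
    start = reply.find("{")
    end = reply.rfind("}")
    if start == -1 or end < start:
        raise ValueError("No JSON object found in reply")
    # json.loads raises json.JSONDecodeError (a ValueError) on malformed input.
    return json.loads(reply[start:end + 1])


# Hypothetical tester reply, for illustration only.
report = parse_json_reply('Report: {"success": false, "errors": ["missing import"]} End.')
passed = bool(report.get("success"))
```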

`config.py`

Purpose: Local storage for API credentials (plain text, so keep it out of version control)

Configuration:

API_KEYS = {
    "google": "YOUR_GOOGLE_API_KEY_HERE",
    "openai": "YOUR_OPENAI_API_KEY_HERE",
    "anthropic": "YOUR_ANTHROPIC_API_KEY_HERE"
}

Security Note: Add config.py to .gitignore to prevent committing API keys
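
A small startup check can catch keys that were never filled in. This sketch assumes the placeholder convention shown above (values starting with `YOUR_`); `missing_keys` is a hypothetical helper, not part of the project.

```python
def missing_keys(api_keys: dict) -> list:
    """Return provider names whose key is empty or still a placeholder."""
    return sorted(
        provider
        for provider, key in api_keys.items()
        if not key or key.startswith("YOUR_")
    )
```

Calling `missing_keys(API_KEYS)` before the first LLM request lets the program stop with a clear message instead of failing on an authentication error mid-workflow.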

Required APIs:

Google Generative AI API (for Gemini)
OpenAI API (for GPT-4o)
Anthropic API (for Claude)

`requirements.txt`

Purpose: Python package dependencies

Dependencies:

google-generativeai>=0.8.0  # Google Gemini API
openai>=1.0.0               # OpenAI API
anthropic>=0.34.0           # Anthropic Claude API

🚀 Installation & Setup

Prerequisites

Python 3.9 or higher
API keys for the Google Generative AI, OpenAI, and Anthropic APIs

Installation Steps

Clone the repository
```
git clone <repository-url>
cd mas
```

Create virtual environment (recommended)

python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

Install dependencies
```
pip install -r requirements.txt
```

Configure API keys

Edit config.py and add your API keys:

API_KEYS = {
    "google": "your-actual-google-key",
    "openai": "your-actual-openai-key",
    "anthropic": "your-actual-anthropic-key"
}

Run the system
```
python main.py
```

💡 Usage Example

Input Prompt

"Create a multi-agent research system where one agent searches for recent 
articles on AI safety, another agent summarizes them, and a third agent 
creates a markdown report."

Generated System Structure

generated_research_system/
├── README.md
├── requirements.txt
├── main.py           # Orchestrator for the research system
├── agents.py         # ResearchAgent, SummaryAgent, ReportAgent
└── tools.py          # web_search(), save_to_file()
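
Writing such a tree to disk can be sketched with `pathlib`. This assumes `generated_code` maps relative file names to file contents, which this document does not confirm; `write_project` is a hypothetical helper.

```python
from pathlib import Path


def write_project(root: str, files: dict) -> list:
    """Create root, write each file under it, and return the paths, sorted."""
    root_path = Path(root)
    root_path.mkdir(parents=True, exist_ok=True)
    written = []
    for name, content in files.items():
        target = root_path / name
        # Create intermediate directories for nested file names.
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content, encoding="utf-8")
        written.append(target)
    return sorted(written)
```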

🔄 Iterative Refinement

The system includes a sophisticated feedback loop:

Code generated by CoderAgent
TesterAgent analyzes and reports issues
If failures detected → Code + Error Report → Refined Code
Maximum 3 iterations before termination
Deploy only on successful validation

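This loop is intended to catch plan deviations and statically detectable errors before deployment. Because the TesterAgent does not execute the code, a passing result does not guarantee runtime behavior.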

🎯 Key Features

Multi-Model Architecture: Assigns a dedicated model to each LLM-backed agent
Automated Testing: Built-in quality assurance
Iterative Refinement: Self-correcting code generation
Complete Deployment: Ready-to-run generated systems
Extensible Design: Easy to add new agents or models

🤝 Contributing

Contributions welcome! Please ensure:

Code follows PEP 8 style guidelines
All existing tests pass
New features include appropriate tests
Documentation is updated
