
Procurement teams in mid-to-large enterprises spend significant time manually researching suppliers, validating purchase requests, and preparing recommendations — often repeating the same structured process for every new requisition. Procurement Assistant is an early-stage, multi-agent AI system designed to automate this cycle. Built with LangGraph and the Anthropic Claude API, it chains four specialized agents — Intake, Procurement, Analyst, and Orchestrator — each with a clearly defined role and a strict output contract. The result is a fully traceable, policy-driven procurement workflow that goes from a free-text purchase request to a structured final report with a supplier recommendation, cost analysis, and a plain-language decision summary.
Enterprise procurement is one of those domains where the process is well-understood, highly structured, and yet still largely manual. In most companies, a purchase request follows a predictable path: the request is captured and validated, classified against the purchasing policy, researched for candidate suppliers, analysed for cost and risk, and finally approved, conditioned, escalated, or rejected.
The challenge is not that this process is complex — it is actually very structured. The challenge is that it is time-consuming, repetitive, and heavily dependent on institutional knowledge: knowing the buying rules, knowing the thresholds, knowing which supplier categories apply to which product types.
This is precisely where AI agents can add value. Not by replacing procurement specialists, but by automating the structured, rule-driven parts of the workflow — so that human attention is focused where it genuinely matters: exceptions, strategic decisions, and supplier relationships.
Procurement Assistant is an attempt to build exactly that. It is a first-version system targeting the core intake-to-decision cycle for procurement requests involving machinery, equipment, vehicles, professional services, and other categories defined in a company's purchasing policy.
Before diving into the architecture, it is worth stating the design principles that shaped the system — because they explain many of the implementation choices.
Auditability above all. Every decision made by every agent must reference the specific rule that drove it. A compliance officer must be able to read the decision log and understand the full reasoning chain without external context.
Strict output contracts. Agents do not produce free-form text. Every agent output is a structured Pydantic model, validated at runtime. If an agent cannot produce valid structured output, the system raises a typed exception (StructuredOutputError) rather than passing malformed data downstream.
Separation of concerns. Each agent does exactly one thing. The Intake Agent does not search for suppliers. The Procurement Agent does not make final decisions. The Orchestrator does not override upstream decisions. This makes the system easier to test, debug, and improve incrementally.
Configuration over hardcoding. Buying rules, decision thresholds, model selection, and prompt versions are all managed through YAML configuration files. Changing a threshold does not require touching agent code.
Prompt versioning. System prompts are versioned, checksummed, and changelog-tracked — the same discipline applied to code.
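Versioning prose the way code is versioned largely comes down to pinning content hashes. A minimal sketch of that discipline (the function names are illustrative, not the project's actual utilities):

```python
import hashlib

def checksum_prompt(text: str) -> str:
    """Stable content hash (hex SHA-256) for a prompt version."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def verify_prompt(text: str, pinned_checksum: str) -> bool:
    """True when the prompt on disk still matches the pinned checksum."""
    return checksum_prompt(text) == pinned_checksum
```

A changelog entry can then record the version label next to its checksum, so any drift between the file and the pinned hash is caught before the agent runs.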
The system is built as a directed agent graph using LangGraph. Each agent is a separate node in the graph. Nodes communicate exclusively through a shared state object (SharedState) — no agent calls another directly.
User Input (CLI)
│
▼
┌──────────────┐
│ Intake │ Validates & classifies the request.
│ Agent │ Triggers clarification rounds if required fields are missing.
└──────┬───────┘
│ validated_request, category_id, process_type
▼
┌──────────────┐
│ Procurement │ Searches for suppliers using web search, PDF parsing,
│ Agent │ and currency conversion tools.
└──────┬───────┘
│ supplier_recommendations, procurement_strategy, negotiation_points
▼
┌──────────────┐
│ Analyst │ Performs TCO analysis and risk assessment.
│ Agent │ Issues final decision based on configurable thresholds.
└──────┬───────┘
│ cost_analysis, risk_analysis, final_decision
▼
┌──────────────┐ ┌──────────────┐
│ Orchestrator │────►│ Human Review │ Activated when decision = ESCALATE
│ Agent │ │ Node │
└──────┬───────┘ └──────────────┘
▼
Final Report + Plain-Language Summary (CLI)
The graph is defined in procurement_system/graph/procurement_graph.py. The shared state flows through each node, accumulating outputs and a running decision log that is included in the final report.
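The state-accumulation pattern the graph relies on can be sketched without LangGraph itself: each node is a function that reads the shared state and returns a partial update, which a runner merges in before calling the next node. The node names mirror the diagram; the update payloads are illustrative.

```python
from typing import Callable

def intake_node(state: dict) -> dict:
    # Classify the request and append to the running decision log.
    return {"category_id": "MACHINERY", "process_type": "formal_rfq",
            "decision_log": state["decision_log"] + ["[intake] classified"]}

def procurement_node(state: dict) -> dict:
    # Research suppliers; real node would call search/PDF/currency tools.
    return {"supplier_recommendations": ["TechMachinery GmbH"],
            "decision_log": state["decision_log"] + ["[procurement] 1 option"]}

def run_graph(state: dict, nodes: list[Callable[[dict], dict]]) -> dict:
    """Apply each node in order, merging its partial update into state."""
    for node in nodes:
        state = {**state, **node(state)}
    return state
```

This is exactly why no agent needs to call another directly: every node's inputs and outputs are visible in one place, which makes the flow easy to unit-test node by node.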
The Intake Agent is the entry point of the system. It receives a raw, free-text purchase request and is responsible for transforming it into a validated, classified, and routed procurement record.
Its core tasks are:
- Validation: checks that the required fields are present: `description`, `quantity`, `unit`. If any are missing, it does not guess or infer — it triggers a clarification round.
- Classification: assigns a product category (e.g. `MACHINERY`, `IT_SOFTWARE`, `PROFESSIONAL_SERVICES`) based on the enterprise buying rules.
- Routing: selects the process type, one of `catalog_purchase`, `rfq`, `formal_rfq`, or `strategic_sourcing` — driven entirely by the rules defined in `enterprise_buying_rules.yaml`.

One of the more interesting implementation details here is how clarification is handled. When required fields are missing, the Intake Agent does not just flag an error — the graph raises a `GraphInterrupt`, which pauses execution and surfaces a structured question to the CLI. The user answers, and the graph resumes via `Command(resume=answer)`.
Plain Python code, not the LLM, controls the clarification loop: how many rounds are allowed, and when to proceed with flags if the user does not respond. The agent only formulates the question; it does not track rounds or decide when to stop.
```python
except GraphInterrupt as e:
    interrupt_obj = e.args[0][0]
    payload = interrupt_obj.value
    question = payload.get("question", "Please provide the missing information")
    answer = input("> ").strip()
    final_state = graph.invoke(Command(resume=answer), config=config)
```
The tone of the clarification question adapts based on the round number — polite and open-ended in round 1, specific and urgent in the final round. This is controlled by a parameter injected at runtime, not hardcoded in the prompt.
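A minimal sketch of that runtime parameter, assuming a three-round limit and hypothetical tone labels:

```python
def clarification_tone(round_number: int, max_rounds: int = 3) -> str:
    """Pick the tone parameter injected into the clarification prompt.
    Tone labels here are illustrative, not the project's actual values."""
    if round_number >= max_rounds:
        return "specific_urgent"   # last chance: ask precisely for what is missing
    if round_number == 1:
        return "polite_open"       # first ask: open-ended and friendly
    return "direct"                # middle rounds: narrow the question down
```

Keeping this in code rather than in the prompt means the escalation behaviour can be changed (or tested) without touching a single prompt file.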
The Procurement Agent receives the validated request from the Intake Agent and is responsible for market research and sourcing strategy.
It produces:
Each supplier recommendation includes: name, type, estimated price range (total, not per-unit), lead time, reliability score, pros, cons, and contact priority.
The Procurement Agent has access to three tools:
| Tool | Purpose |
|---|---|
| Supplier Web Search | Searches the web for suppliers matching the request category, powered by Tavily |
| PDF Reader | Extracts and parses supplier catalogues, offers, and technical documents in PDF format |
| Currency Converter | Converts supplier price quotes to a common currency for fair comparison |
The tool layer is structured in three tiers: tools/ (agent-facing interface), services/ (business logic), and repositories/ (external API calls). This separation makes each layer independently testable.
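The tiering can be illustrated with the currency tool. The class names are assumptions, and the repository's external API call is replaced by stubbed data, but the shape of the layering is the point:

```python
class FxRepository:
    """Repository tier: would call an external FX API; stubbed here."""
    def get_rate(self, src: str, dst: str) -> float:
        rates = {("EUR", "USD"): 1.08}   # stubbed external data
        return rates[(src, dst)]

class CurrencyService:
    """Service tier: business logic, independent of any API client."""
    def __init__(self, repo: FxRepository):
        self.repo = repo

    def convert(self, amount: float, src: str, dst: str) -> float:
        if src == dst:
            return amount
        return round(amount * self.repo.get_rate(src, dst), 2)

def currency_converter_tool(amount: float, src: str, dst: str) -> str:
    """Tool tier: thin, agent-facing wrapper returning a plain string."""
    value = CurrencyService(FxRepository()).convert(amount, src, dst)
    return f"{value} {dst}"
```

In tests, the repository is the only layer that needs mocking; the service and tool tiers run against a fake repository exactly as shown.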
The Analyst Agent performs the financial and risk analysis, and issues the final procurement decision.
Its output includes:
Total Cost of Ownership (TCO) analysis:
Risk assessment:
- `risk_score` calculated as the weighted average of probability × impact for each risk, clamped to a 1.0–10.0 scale

Final decision, determined by configurable thresholds:
```yaml
# config/config_analyst_agent.yaml
decision_thresholds:
  auto_proceed: 3.0    # risk_score ≤ 3.0 → PROCEED
  auto_escalate: 7.5   # risk_score ≥ 7.5 → ESCALATE
```
The four possible decisions are:
| Decision | Meaning |
|---|---|
| `PROCEED` | Risk and budget within thresholds — approved |
| `PROCEED_WITH_CONDITIONS` | Approved, with specific conditions to satisfy before signing |
| `ESCALATE` | Requires human review — risk or value exceeds thresholds |
| `REJECT` | Request is fundamentally unfeasible |
The risk score formula is deterministic — given the same inputs, the same score is always produced. This is intentional: auditability requires reproducibility.
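Under those rules, scoring and thresholding reduce to a small pure function. The exact shape of the per-risk inputs is an assumption; the clamping range and the threshold values come from the configuration described earlier.

```python
def risk_score(risks: list[dict]) -> float:
    """Weighted average of probability × impact, clamped to [1.0, 10.0].
    Each risk: {"probability": 0..1, "impact": 1..10, "weight": > 0}."""
    total_weight = sum(r["weight"] for r in risks)
    raw = sum(r["weight"] * r["probability"] * r["impact"] for r in risks) / total_weight
    return min(10.0, max(1.0, raw))

def decide(score: float, auto_proceed: float = 3.0, auto_escalate: float = 7.5) -> str:
    """Map a risk score to a decision via the configured thresholds.
    REJECT is reserved for fundamentally unfeasible requests and is
    not threshold-driven, so it does not appear here."""
    if score <= auto_proceed:
        return "PROCEED"
    if score >= auto_escalate:
        return "ESCALATE"
    return "PROCEED_WITH_CONDITIONS"
```

Because both functions are pure, the same inputs always yield the same decision, which is what makes the audit trail reproducible.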
The Orchestrator is the final node in the graph. It makes no new decisions — it synthesises and communicates everything the upstream agents have produced.
It generates two outputs:
A structured final report (LLMFinalReport) containing the full TCO analysis, risk assessment, supplier recommendation, conditions, next steps, and the complete decision log from all agents.
A plain-language message for the requester (maximum 150 words, zero technical jargon). Terms like "TCO", "risk_score", or "rfq" are explicitly prohibited in this output. The message tells the requester: what was decided, who the recommended supplier is, what the cost looks like, and what the next concrete steps are.
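A rule like this is easy to enforce mechanically after generation. A sketch of such a post-hoc check, with deliberately simple token handling:

```python
# Banned terms taken from the rule above; the check itself is illustrative.
PROHIBITED_JARGON = {"tco", "risk_score", "rfq"}

def violates_message_rules(message: str, max_words: int = 150) -> bool:
    """True if the requester-facing message breaks the word budget
    or contains any banned technical term."""
    words = message.lower().replace(",", " ").replace(".", " ").split()
    if len(words) > max_words:
        return True
    return any(term in words for term in PROHIBITED_JARGON)
```

Running this as a validation step after the Orchestrator means a jargon slip can trigger a regeneration instead of reaching the requester.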
If the Analyst Agent has issued an ESCALATE decision, the graph routes through the Human Review Node before producing the final report — a deliberate circuit breaker to ensure high-risk purchases always involve a human.
All agents read from and write to a single SharedState TypedDict. This means there is no message-passing between agents — only state transitions. The decision log is an append-only list that every agent writes to, producing a complete audit trail by the time the Orchestrator runs.
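In LangGraph, an append-only field is expressed with a reducer annotation on the state schema, so node updates are concatenated rather than overwritten. A hypothetical subset of `SharedState`:

```python
import operator
from typing import Annotated, TypedDict

class SharedState(TypedDict, total=False):
    """Illustrative subset of the real shared state."""
    validated_request: dict
    supplier_recommendations: list
    final_decision: str
    # The Annotated reducer marks this field as append-only:
    # each node's log entries are added to the list, never replacing it.
    decision_log: Annotated[list, operator.add]
```

Every other field follows last-writer-wins semantics, which is safe here because each field has exactly one producing agent.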
Each agent has two prompts: a system prompt (defining the agent's role, reasoning approach, and output rules) and a user prompt (injected at runtime with the current state fields the agent needs). The user prompt is assembled programmatically by utils/prompt_assembler.py, which injects structured data from SharedState into a template.
This separation means the system prompt defines behaviour and the user prompt delivers data — a clean contract that makes prompts easier to test and version independently.
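A simplified stand-in for that assembly step, using `string.Template` rather than the project's actual templating:

```python
import string

def assemble_user_prompt(template: str, state: dict, fields: list[str]) -> str:
    """Render the user prompt by injecting only the state fields this
    agent needs (a sketch of what utils/prompt_assembler.py does)."""
    data = {name: str(state.get(name, "<missing>")) for name in fields}
    return string.Template(template).safe_substitute(data)
```

Because the field list is explicit, each agent sees only the slice of `SharedState` it is entitled to, and a missing field shows up as a visible placeholder rather than a silent blank.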
Agent outputs are enforced via Pydantic using LangChain's `.with_structured_output()`. If the LLM produces output that does not match the schema, the system raises `StructuredOutputError` — a typed exception that is caught at the top level in `main.py` and handled gracefully, returning control to the user for a new request without crashing the graph.
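The enforcement pattern can be sketched in a few lines, assuming Pydantic v2. The schema fields are a hypothetical subset, not the project's real intake contract:

```python
from pydantic import BaseModel, ValidationError

class StructuredOutputError(Exception):
    """Typed failure: the agent could not produce schema-valid output."""

class IntakeOutput(BaseModel):
    # Hypothetical subset of the real intake schema.
    description: str
    quantity: int
    unit: str
    category_id: str

def parse_intake(raw: dict) -> IntakeOutput:
    """Validate raw model output against the contract, failing loudly
    instead of passing malformed data downstream."""
    try:
        return IntakeOutput.model_validate(raw)
    except ValidationError as exc:
        raise StructuredOutputError(str(exc)) from exc
```

Converting `ValidationError` into a domain exception at the boundary keeps Pydantic internals out of the top-level error handling in `main.py`.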
The model used by each agent is configured independently in config/model_registry.yaml. This means different agents can run on different Claude models — for example, a lighter model for the Orchestrator (which mostly reformats existing data) and a more capable model for the Analyst (which requires deeper reasoning).
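Resolution can be as simple as a dictionary lookup with a default. The registry shape and the model identifiers below are placeholders, not the contents of the project's actual `model_registry.yaml`:

```python
# Parsed form of a hypothetical model registry; identifiers are placeholders.
MODEL_REGISTRY = {
    "default": "claude-capable-model",
    "orchestrator": "claude-light-model",   # mostly reformats existing data
}

def model_for(agent: str) -> str:
    """Resolve an agent's model name, falling back to the registry default."""
    return MODEL_REGISTRY.get(agent, MODEL_REGISTRY["default"])
```

Swapping a model for one agent is then a one-line YAML change, with no risk of touching the agents that were working fine.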
The codebase follows a layered architecture:
procurement_system/
├── agents/ ← Agent logic (one subdirectory per agent)
├── config/ ← All YAML configuration (buying rules, thresholds, models)
├── graph/ ← LangGraph graph definition
├── nodes/ ← Graph nodes (one per agent + human_review)
├── prompts/ ← System and user prompts (versioned, checksummed)
├── repositories/ ← External API calls (Tavily, currency, PDF)
├── schemas/ ← Pydantic models for all agent inputs and outputs
├── services/ ← Business logic layer
├── tools/ ← Agent-facing tool interfaces
└── utils/ ← Shared utilities (prompt loading, LLM setup, logging)
tests/
├── agents/ ← Agent-level tests
├── nodes/ ← Node-level tests
├── repositories/ ← Repository-level tests
└── services/ ← Service-level tests
The test suite mirrors the source structure exactly, with coverage at every layer.
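One invariant worth testing at the node level is audit-trail completeness: by the time the Orchestrator runs, every agent should have logged at least one entry. A sketch of such a check (the helper name is illustrative):

```python
REQUIRED_AGENTS = ("intake", "procurement", "analyst", "orchestrator")

def assert_complete_audit_trail(decision_log: list[str]) -> None:
    """Fail if any agent is missing from the decision log, which would
    mean the final report cannot show a full reasoning chain."""
    for agent in REQUIRED_AGENTS:
        assert any(entry.startswith(f"[{agent}]") for entry in decision_log), agent
```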
The system runs as a command-line application:
python main.py
A sample session:
Provide purchase requisition (or exit):
> 2 units CNC milling machine 3-axis, budget ~90000 USD, delivery by September 2026
📄 REPORT
============================================================
Decision: PROCEED_WITH_CONDITIONS
Message: Your request for 2 CNC milling machines has been reviewed.
We recommend TechMachinery GmbH as the primary supplier,
with a total estimated cost of $82,000–$94,000.
Before signing, please confirm their delivery guarantee
in writing. Next step: contact TechMachinery GmbH
to confirm pricing and lead time.
--- Decision log ---
• [intake] Classified as MACHINERY / formal_rfq — rule: threshold_50k_usd
• [procurement] 3 supplier options identified
• [analyst] risk_score: 3.8 — PROCEED_WITH_CONDITIONS
• [orchestrator] Final report compiled
This is a v0.1 system. It is important to be clear about what it does not yet do:
- Buying rules are static, defined manually in `enterprise_buying_rules.yaml`; there is no self-learning from past decisions.

These are known limitations, not oversights. The goal of this version was to validate the multi-agent workflow and the agent contracts — not to build a production-ready procurement platform.
The natural next steps for this system, in rough priority order:
| Component | Technology |
|---|---|
| Agent framework | LangGraph |
| LLM provider | Anthropic Claude (via API) |
| Output validation | Pydantic |
| Supplier web search | Tavily |
| PDF parsing | Custom service layer |
| Currency conversion | Custom service layer |
| Configuration | YAML |
| Language | Python 3.10+ |