The Multi-Agent Insurance Policy Cancellation System is an AI-powered application that automates insurance policy cancellation workflows using a coordinated agentic architecture.
This system uses Large Language Models (LLMs), tool integrations, validation guardrails, testing strategies, and workflow orchestration to process user requests safely and generate structured cancellation outcomes. Using mock data stored in CSV files, multiple specialized agents collaborate to validate policy details, evaluate cancellation eligibility against predefined business rules, calculate refundable amounts, log refund records, and generate a professional cancellation notice in PDF format.
The system is deployed with a Streamlit UI, which makes it more user-friendly and interactive.
This system is an AI-driven, multi-agent workflow engine that processes insurance cancellation requests in a structured, validated, and secure manner.
Traditional insurance cancellation processes involve significant manual effort, require manual data verification, and carry a high risk of inconsistent validation. This agentic AI system addresses those problems with automated, structured intake of cancellation requests; intelligent validation and policy verification; risk and compliance checks; and safe, auditable decision workflows.

This system uses a multi-agent, tool-augmented LLM architecture designed to safely automate structured insurance cancellation workflows.
Rather than relying on a single monolithic prompt, the system decomposes the workflow into role-specialised agents, deterministic routing logic, tool-augmented reasoning, and guardrails-enforced structured outputs.
Whether a policy cancellation succeeds is determined by the following rules:
Only policies that satisfy all rules are eligible for cancellation and refund processing.
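As a minimal sketch, the rule set described later for the analysis agent (the policy must be active, the payment must have been made, and the current date must fall before the policy end date) could be expressed as a single predicate. Field names here are assumptions about the mock dataset:

```python
from datetime import date

def is_eligible(policy: dict, today: date) -> bool:
    # A policy is eligible only if ALL rules hold: active status,
    # payment made, and the current date before the policy end date.
    # (Field names are assumptions about the mock dataset.)
    return (
        policy["policy_status"].lower() == "active"
        and policy["is_payment_made"]
        and today < policy["end_date"]
    )
```

A policy failing any single rule is rejected; for example, an "Expired" status or a missing payment makes `is_eligible` return `False`.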
This project uses mock data: all customer and policy data is stored in a CSV file named insurance_policies.csv. The dataset is available at:
https://github.com/jingozuo/AAIDC_Project3_JZ/blob/main/data/insurance_policies.csv
This custom data lookup tool searches for and validates the existence of a policy by querying the mock CSV dataset.
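A lookup of this kind can be sketched with the standard-library csv module (the column name "policy_number" and the exact return shape are assumptions about the project's implementation):

```python
import csv

def lookup_policy(policy_number: str,
                  csv_path: str = "data/insurance_policies.csv"):
    # Return the first matching row as a dict, or None when no policy matches.
    # (The column name "policy_number" is an assumption about the mock dataset.)
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            if row.get("policy_number") == policy_number:
                return row
    return None
```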
The tool evaluates whether a policy is eligible for cancellation based on the policy's status, payment, and dates.
Once the policy is confirmed to be eligible for a refund, this tool computes the refund amount from the policy dates and payment only.
This tool appends one new refund record to the output CSV file as evidence.
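An append-only CSV log of this kind might look like the following sketch; the output file name and column set are assumptions:

```python
import csv
import os
from datetime import datetime, timezone

def log_refund(record: dict, csv_path: str = "output/refund_log.csv") -> None:
    # Append one refund record; write the header row only when the file is new.
    # (The output file name and column set are assumptions.)
    fieldnames = ["timestamp", "policy_number", "refund_amount"]
    os.makedirs(os.path.dirname(csv_path), exist_ok=True)
    write_header = not os.path.exists(csv_path)
    with open(csv_path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        if write_header:
            writer.writeheader()
        writer.writerow({"timestamp": datetime.now(timezone.utc).isoformat(),
                         **record})
```

Appending rather than overwriting preserves every approved refund as an auditable record.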
The last step for processing insurance cancellation is to generate a PDF notice with all relevant details for the user.
This prompt asks the user to provide a policy number and confirm whether they want to proceed with cancelling the policy. The output is JSON, following the schema:
{
  "policy_number": "string",
  "first_name": "string",
  "last_name": "string",
  "start_date": "date",
  "end_date": "date",
  "policy_status": "string",
  "payment_amount": "string",
  "is_payment_made": "boolean"
}
It's responsible for generating a clear and polite cancellation summary for the customer. The summary includes:
The intake agent is responsible for collecting the policy number, looking up the policy in the data source, and confirming the policy details with the user, using the Data Lookup Tool and the Intake Assistant Prompt. It loops until the policy is found and confirmed, or the maximum number of attempts is reached.
If the provided policy number is found in the mock dataset, the agent retrieves the policy details and asks the customer to confirm whether they are correct. If the policy number cannot be found, the agent asks the customer to enter the correct policy number.
The analysis agent is responsible for determining whether the policy is eligible for cancellation. It uses the Check Cancellation Eligibility Tool to check whether the policy is active, whether the payment has been made, and whether the current date is before the policy end date.
The refund agent automatically computes the refund amount for the policy using the Refund Calculator Tool. Refunds are calculated pro rata, based on the remaining policy duration and the total payment amount.
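The pro-rata calculation can be sketched as the payment scaled by the unused fraction of the policy term; the exact day-count and rounding conventions used by the real tool are assumptions here:

```python
from datetime import date

def calculate_refund(start: date, end: date, payment: float,
                     today: date) -> float:
    # Pro-rata refund: payment scaled by the unused fraction of the policy term.
    # (Exact day-count and rounding conventions are assumptions.)
    total_days = (end - start).days
    remaining_days = max((end - today).days, 0)  # never refund a negative amount
    return round(payment * remaining_days / total_days, 2)
```

For a one-year policy paid at 365.00 and cancelled at mid-year, roughly half of the payment is refunded.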
Logger agent persists approved refund decisions into a CSV file stored in the output directory. This agent uses Refund Log Tool, a custom CSV logging tool, and is executed only after human approval of the refund. Logged data includes:
The summary agent generates the cancellation notice from the policy and refund details. It uses the Summary Assistant Prompt to produce well-structured, formal language suitable for customer communication. The generated notice is then exported as a PDF file using the custom Notice Generator Tool, and represents the final customer-facing output of the system.
The system also includes human-in-the-loop interactions at key decision boundaries within the flow to enhance safety, accountability, and trust. There are two human-in-the-loop checkpoints implemented in the workflow.
This checkpoint follows the Analysis Agent. A human reviewer examines the eligibility decision and explicitly approves or rejects the policy cancellation. Once the reviewer approves the eligibility check, the workflow proceeds to the Refund Agent; otherwise, the workflow terminates safely.
A human review checkpoint is added after the Refund Agent, ensuring that financial outcomes are verified before any permanent record is created. Only when the refund is approved does the workflow proceed to the Logger Agent; otherwise, the workflow terminates safely.
The DeepEval framework is used in this project to evaluate the insurance cancellation workflow. The evaluation covers five dimensions:
The system implements a multi-layered safety architecture combining:
The system enforces strict behavioral constraints directly inside prompt_config.yaml.
From intake_assistant_prompt:
From summary_assistant_prompt:
The system implements input validation and sanitization inside guardrails_safety.py.
LLM-generated summaries are validated using validate_notice_output().
Validations performed:
Safe Fallback Mechanism
If the output is empty, of an invalid type, raises an exception, or fails validation, the system returns:
"Your insurance cancellation has been processed. Please retain this notice for your records."
This prevents system crashes and ensures that no partial or corrupted LLM output is exposed.
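The fallback behaviour can be sketched as follows; the real validate_notice_output() performs additional validations beyond this minimal type-and-emptiness check:

```python
FALLBACK_NOTICE = ("Your insurance cancellation has been processed. "
                   "Please retain this notice for your records.")

def validate_notice_output(text) -> str:
    # Return the LLM output only when it is a non-empty string; anything else
    # (None, wrong type, blank text) falls back to the fixed safe notice.
    # (A simplified sketch of the real function's fallback path.)
    if isinstance(text, str) and text.strip():
        return text
    return FALLBACK_NOTICE
```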
All guardrail events are logged to the file guardrails_compliance.jsonl.
Example:
{"timestamp": "2026-03-01T22:23:31.894994+00:00", "event_type": "output_validation", "stage": "summary", "message": "Notice text validated", "validated": true}
The system implements a structured, production-grade testing strategy using pytest, organized into:
The test suite is modular and maps directly to system components.
conftest.py
run_tests.py
test_e2e_system_flows.py
test_integration_agent_communication.py
test_nodes.py
test_prompt_builder.py
test_tools_cancellation_rules.py
test_tools_data_lookup.py
test_tools_notice_generator.py
test_tools_refund_calculator.py
test_tools_refund_logger.py
These tests validate each individual agent node's functions, execution logic, output schema compliance, proper state transitions, and failure handling inside nodes. They ensure that each agent produces structured output and that invalid inputs are rejected.
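To illustrate the pattern (the node behaviour and state keys below are simplified stand-ins, not the project's actual node), a pytest-style node test might look like:

```python
def intake_node(state: dict) -> dict:
    # Stand-in for a real agent node: validate the input, then advance the stage.
    # (A hypothetical simplification for illustration only.)
    if not state.get("policy_number"):
        return {**state, "error": "policy_number is required"}
    return {**state, "stage": "analysis"}

def test_intake_node_rejects_missing_policy_number():
    # Invalid input must be rejected rather than silently passed through.
    assert "error" in intake_node({})

def test_intake_node_advances_stage():
    # Valid input must produce a structured state transition.
    assert intake_node({"policy_number": "P-1001"})["stage"] == "analysis"
```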
Result:

The test validates whether prompt templates are correctly formatted, whether required system instructions are included, and whether JSON schema constraints are injected. The purpose is to prevent prompt drift and regression when modifying templates.
Result:

Each custom tool is tested independently:
These tests validate orchestrator routing logic, agent-to-agent handoffs, tool invocation sequencing, state-passing consistency, the multi-agent chain, and HITL resume handoffs. They ensure that agents communicate correctly, that no schema corruption occurs across steps, and that routing logic flows deterministically.
Result:
The end-to-end test runs the full insurance cancellation graph (intake → analysis → HITL → refund → HITL → logger → summary) with mocked user input, HITL decisions, and LLM. It verifies that the entire flow completes and produces the expected outputs (refund log, PDF notice).
Result:

git clone https://github.com/jingozuo/AAIDC_Project3_JZ.git
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
cp .env.example .env
Set the llm_model in config/config.yaml to match your provider (e.g. llama-3.3-70b-versatile for Groq).
llm_model: llama-3.3-70b-versatile
From the project root:
python codes/main.py
Or from the codes directory:
cd codes && python main.py
Run the same workflow in a browser:
streamlit run codes/streamlit_app.py
Tests use pytest. Run from the project root:
pytest tests/test_e2e_system_flows.py
Run a single test, e.g. test_tools_data_lookup.py
pytest tests/test_tools_data_lookup.py
The system uses Streamlit to provide an interactive conversational workflow. The UI follows a step-based stateful flow, where the session state determines the next prompt and available user actions.
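The step-based flow can be sketched as a small state machine; in the real app, Streamlit's st.session_state plays the role of the plain dict used here, and the step names mirror the workflow graph (intake → analysis → HITL → refund → HITL → logger → summary), though the exact identifiers are assumptions:

```python
# Step names mirror the workflow graph; exact identifiers are assumptions.
STEPS = ["intake", "analysis", "hitl_eligibility", "refund",
         "hitl_refund", "logger", "summary"]

def advance(session_state: dict) -> dict:
    # Move the session to the next step (or stay at the final one).
    # In the real app, st.session_state would be passed in place of this dict.
    i = STEPS.index(session_state.get("step", "intake"))
    session_state["step"] = STEPS[min(i + 1, len(STEPS) - 1)]
    return session_state
```

On each rerun, the UI reads the current step from session state and renders only the prompt and actions valid for that step.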
The interface design includes:

A demonstration video is attached to illustrate the full interaction flow.
The workflow proceeds as follows:
The system implements a centralized retry and logging mechanism to ensure production-grade resilience and full operational traceability.
All external-risk operations, including tool calls and LLM invocations, are wrapped using call_with_retry(). This wrapper ensures automatic retry on transient failures, exponential backoff, controlled retry limits, structured JSON logging, and safe fallback behaviour.
All operations use:
If all attempts fail, the retry is marked as exhausted and fallback logic is executed.
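A retry wrapper of this shape might look like the following sketch; the signature and event field names are assumptions about the real call_with_retry():

```python
import json
import time
from datetime import datetime, timezone

def call_with_retry(fn, *, max_attempts=3, base_delay=1.0, fallback=None):
    # Retry with exponential backoff (base_delay, 2x, 4x, ...), emitting a
    # structured JSON event for each failure; once attempts are exhausted,
    # return the safe fallback. (Signature is an assumption.)
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception as exc:
            event = {
                "timestamp": datetime.now(timezone.utc).isoformat(),
                "event_type": ("retry_attempt" if attempt < max_attempts
                               else "retry_exhausted"),
                "attempt": attempt,
                "error": str(exc),
            }
            print(json.dumps(event))  # the real system appends to the JSONL log
            if attempt < max_attempts:
                time.sleep(base_delay * 2 ** (attempt - 1))
    return fallback
```

Transient failures are retried transparently; persistent failures degrade to the fallback instead of crashing the workflow.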
Retry Attempt - When an operation fails but retry attempts remain:
Retry Exhausted - If all attempts fail:
All retry-related logs are written to: logs/guardrails_compliance.jsonl.
This is the same structured log file used by guardrails validation events. This design ensures:
Possible Issue 1: LLM Timeout
Causes: High latency or API overload
Solutions:
Possible Issue 2: Guardrails Validation Failure
Causes: Output not matching schema
Solutions:
Possible Issue 3: UI Not Rendering
Causes: Streamlit environment or session-state issues
Solutions:
The Multi-Agent Insurance Cancellation System is a production-ready AI workflow designed to automate policy cancellation in a safe, structured, and auditable manner. The system combines modular multi-agent orchestration, strict JSON schema enforcement, Guardrails-based validation, and centralized retry and logging mechanisms to ensure reliability and compliance.
By enforcing deterministic routing, input sanitization, output filtering, and exponential backoff retry logic, the system minimizes hallucination risk, prevents malformed outputs, and gracefully handles failures without disrupting the user experience. All critical events — including validation checks, retry attempts, and fallback executions — are captured in structured compliance logs to support traceability and operational monitoring.
Comprehensive unit, integration, and end-to-end testing further ensures workflow integrity and resilience under real-world conditions.
Overall, the system demonstrates a robust approach to building secure, maintainable, and enterprise-ready agentic AI applications, prioritizing safety, reliability, and transparency over uncontrolled autonomy.