https://github.com/xiongQvQ/multi_agent_demo
https://youtu.be/9Mc32O3HoPQ
This project builds a compact, production‑minded multi‑agent workflow for research, analysis, and
reporting using LangGraph (orchestration) and Google Gemini (reasoning). Three specialized agents
—Researcher, Analyst, Reporter—coordinate via shared state and call vetted tools: web search (with
graceful offline fallback), calculator, and a safe file writer. Guardrails enforce prompt isolation,
output redaction, and restricted file paths. A CLI and Streamlit UI support reproducible runs that
generate professional markdown reports under a configurable output directory.
Modern research requires reliable synthesis across noisy sources with strong safety controls.
This submission demonstrates an extensible system that turns open‑ended prompts into structured,
auditable outputs for ReadyTensor’s agentic AI context. We decompose roles to improve clarity and
control, encapsulate side effects in tools, and use LangGraph to make state transitions explicit. The
objective is end‑to‑end value: consistent search, quantitative reasoning, and report generation with
reproducibility and defense‑in‑depth safeguards.
The workflow is a linear LangGraph pipeline of three nodes. Researcher extracts keywords and queries a
Serper‑backed search tool, falling back to curated results when the API is absent. Analyst reviews
findings, identifies needed calculations (e.g., P/E, percentage change), and uses a sandboxed
calculator. Reporter synthesizes the narrative and persists a markdown report through a constrained
file tool. Gemini 2.0 Flash handles reasoning. OutputFilter redacts sensitive strings and caps length;
file writes are confined to OUTPUT_DIR.
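The linear three-node handoff can be sketched without any dependencies. The real project wires these as LangGraph nodes (a `StateGraph` with researcher → analyst → reporter edges); here plain functions pass a shared state dict to illustrate the same flow. All function bodies are illustrative stand-ins, not the project's actual logic.

```python
# Dependency-free sketch of the three-node linear workflow: each node
# reads and enriches a shared state dict, mirroring LangGraph's
# explicit state transitions. Node internals are illustrative only.

def researcher(state: dict) -> dict:
    # Extract keywords and attach (possibly fallback) search findings.
    keywords = [w for w in state["prompt"].split() if len(w) > 3]
    state["findings"] = [f"result for {k}" for k in keywords]
    return state

def analyst(state: dict) -> dict:
    # Identify needed calculations; the real node calls a sandboxed calculator.
    state["metrics"] = {"num_findings": len(state["findings"])}
    return state

def reporter(state: dict) -> dict:
    # Synthesize a markdown report from findings and metrics.
    lines = ["# Report", *[f"- {f}" for f in state["findings"]]]
    state["report"] = "\n".join(lines)
    return state

def run_workflow(prompt: str) -> dict:
    state = {"prompt": prompt}
    for node in (researcher, analyst, reporter):  # linear graph: no branches
        state = node(state)
    return state
```

In the actual system each node also calls Gemini for reasoning; keeping the graph linear makes every state transition auditable.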
We exercised the system with representative prompts: “Analyze Tesla’s stock performance,” “Research
Microsoft’s latest quarterly earnings,” and mixed‑language queries. Two modes were covered: offline
(fallback search) and online (Serper). We verified state handoff between agents, tool interoperability,
safety filters, and deterministic testability via injectable LLM/tools. We also validated both the CLI
and Streamlit paths, ensuring reports are created, readable, and stored under the configured output
directory.
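The deterministic-testability claim rests on dependency injection: nodes receive their LLM and tools as arguments, so an offline test can substitute canned stubs. The sketch below is a hypothetical illustration of that pattern; `FakeLLM`, `make_researcher`, and `fake_search` are assumed names, not the project's API.

```python
# Illustrative injectable-LLM pattern for deterministic, offline tests.
# A node factory takes the LLM and search tool as parameters, so tests
# swap in stubs instead of Gemini/Serper. All names are hypothetical.

class FakeLLM:
    def __init__(self, reply: str):
        self.reply = reply

    def invoke(self, prompt: str) -> str:
        return self.reply  # canned, deterministic answer

def make_researcher(llm, search):
    def node(state: dict) -> dict:
        query = llm.invoke(f"Extract search keywords: {state['prompt']}")
        state["findings"] = search(query)
        return state
    return node

def fake_search(query: str) -> list[str]:
    # Stands in for the curated fallback used when the Serper API is absent.
    return [f"offline result for '{query}'"]

researcher = make_researcher(FakeLLM("tesla stock"), fake_search)
out = researcher({"prompt": "Analyze Tesla's stock performance"})
```

The same factory accepts the real Gemini client and Serper tool in production, so test and production paths exercise identical node code.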
The workflow produced well‑structured summaries, analytical insights, and final markdown reports saved
to outputs/. Tool calls were isolated and resilient: network failures gracefully triggered fallback
search; the calculator handled arithmetic and common finance patterns while rejecting code injection;
the file tool enforced extensions and path safety. The Streamlit app surfaced progress and results
clearly. The design supported reproducible runs and easy substitution of models or tools for tuning and
future extensions.
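The two guardrails above (injection-rejecting arithmetic and path-confined writes) can be sketched as follows. This is a minimal illustration, assuming an AST-walking calculator and a `pathlib`-based containment check; the project's actual implementations may differ in detail.

```python
import ast
import operator
from pathlib import Path

# Sketch of two guardrails: an AST-based calculator that evaluates only
# numeric arithmetic (anything else, e.g. calls or attribute access used
# in code injection, raises) and a writer confined to OUTPUT_DIR with an
# extension allowlist. Details are assumptions, not the project's code.

_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv,
        ast.Pow: operator.pow, ast.USub: operator.neg}

def safe_calc(expr: str) -> float:
    def ev(node):
        if isinstance(node, ast.Expression):
            return ev(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.operand))
        raise ValueError("unsupported expression")  # rejects injection attempts
    return ev(ast.parse(expr, mode="eval"))

OUTPUT_DIR = Path("outputs")

def safe_write(name: str, text: str) -> Path:
    if not name.endswith(".md"):
        raise ValueError("only .md reports allowed")
    path = (OUTPUT_DIR / name).resolve()
    if OUTPUT_DIR.resolve() not in path.parents:
        raise ValueError("path escapes OUTPUT_DIR")  # blocks ../ traversal
    OUTPUT_DIR.mkdir(exist_ok=True)
    path.write_text(text)
    return path
```

For example, `safe_calc("(300-250)/250*100")` computes a percentage change, while `safe_calc("__import__('os')")` and `safe_write("../evil.md", ...)` both raise.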
This demo shows a robust pattern for agentic research: clear roles, explicit orchestration, narrow
tools, and layered safety. It is ready for extension with retrieval, memory, richer financial models,
or parallel branches in LangGraph. For production hardening, we recommend deeper evaluations, latency
and error metrics, secrets management, and CI that prioritizes offline tests. The included Dockerfile
and GitHub Actions provide deployment scaffolding and a path to scale the system responsibly.