AAIDC Module 2: Build Your Multi-Agent System.
Repository: https://github.com/phanminhtai23/finsight-multi-agent
FinSight is a production-style, multi-agent system that answers financial questions about any
company, drawing on documents you upload (PDF / Word / scanned images) or on live web sources,
and always answers with inline citations back to the exact source. A LangGraph supervisor
coordinates six specialized agents over a retrieval-augmented-generation (RAG) layer backed by
Qdrant, with tools exposed through a dedicated Model Context Protocol (MCP) server. Long
jobs (document ingestion) run asynchronously so the user can keep chatting. A full React app
wraps it with auth, token streaming, a live "thinking" view, on-the-fly charts, and
per-topic knowledge bases. The whole stack runs with a single `docker compose up` and is verified
end-to-end.
Reading financial reports is slow, and generic chatbots are untrustworthy for finance: they
hallucinate figures and can't show where a number came from. FinSight targets exactly this gap
with grounded, cited answers that combine a user's private documents with live market research.
Each citation marker [n] maps to a document page (deep-linked on Cloudinary) or a web URL.

```mermaid
flowchart TD
    U["React UI (Vite + TS)"] -->|"REST · SSE · WebSocket"| API["FastAPI<br/>(RESTful · SOLID · DI)"]

    subgraph AGENTS["LangGraph multi-agent graph"]
        direction TB
        SUP(["Supervisor<br/>triage & route"])
        RET["Retrieval<br/>(RAG)"]
        MR["Market Research"]
        AN["Analyst"]
        WR["Writer<br/>+ citations"]
        CR["Critic<br/>grounding check"]
        SUP --> RET --> AN --> WR --> CR
        SUP --> MR --> AN
        CR -.->|"revise ≤2"| AN
    end

    API --> SUP

    subgraph DATA["Datastores"]
        QD[("Qdrant<br/>vectors · per-topic")]
        PG[("Postgres<br/>relational + checkpointer")]
        RD[("Redis<br/>cache · pubsub · queue")]
    end

    subgraph EXT["Tools & services"]
        MCP["MCP server :8001<br/>web_search · fetch_url<br/>company_financials · calculator"]
        CL["Cloudinary<br/>raw files"]
        LS["LangSmith<br/>tracing & eval"]
    end

    RET --> QD
    MR -->|"MCP client"| MCP
    API --> PG
    API --> RD
    API --> WK["ARQ worker<br/>async ingestion"]
    WK --> CL
    WK --> QD
    WK -.->|"progress · pubsub"| RD
    API -.->|"trace"| LS
```
Request flow. The UI calls FastAPI over REST; chat answers stream back over SSE (tokens,
thinking, citations, charts, tool names) and ingestion progress over WebSocket. FastAPI invokes
the LangGraph graph, whose Supervisor routes to the right agents; the Retrieval agent queries
Qdrant, and the Market Research agent calls the MCP server as a client. Uploads are handled
out-of-band by an ARQ worker that parses, chunks, embeds and indexes into Qdrant while the user
keeps chatting.
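
To make the streaming half of the request flow concrete, here is a minimal sketch of how chat updates could be framed as Server-Sent Events. The event names (`token`, `citations`, `done`) and payload shapes are assumptions for illustration; the repository's actual wire format may differ.

```python
import json
from typing import Iterable, Iterator


def format_sse(event: str, data: dict) -> str:
    """Serialize one SSE frame: an 'event:' line, a 'data:' line, then a blank line."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


def stream_answer(tokens: Iterable[str], citations: list[dict]) -> Iterator[str]:
    """Yield one frame per token, then the citations, then a final 'done' marker.

    Event names here are hypothetical, chosen to mirror the update kinds
    listed in the request-flow paragraph.
    """
    for tok in tokens:
        yield format_sse("token", {"text": tok})
    yield format_sse("citations", {"items": citations})
    yield format_sse("done", {})


frames = list(
    stream_answer(["Revenue ", "grew."], [{"n": 1, "url": "https://example.com"}])
)
```

A generator like this can be handed to any HTTP framework's streaming response with the `text/event-stream` media type; the browser's `EventSource` then dispatches each frame by its event name.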
Design principles: SOLID (thin controllers, business logic in services, data access and tools
behind Protocol interfaces, dependency injection), a RESTful versioned API, ruff + pytest,
and a clean separation between relational state (Postgres + LangGraph checkpointer) and vectors
(Qdrant). Full design: ARCHITECTURE.md.
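
The "data access behind Protocol interfaces, dependency injection" principle can be sketched as follows. All names here (`Chunk`, `EvidenceStore`, `InMemoryStore`, `RetrievalService`) are hypothetical and not taken from the repository; the in-memory store stands in for a real Qdrant-backed adapter.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Chunk:
    text: str
    page: int


class EvidenceStore(Protocol):
    """Port that the service depends on; any object with this method satisfies it."""

    def search(self, query: str, top_k: int) -> list[Chunk]: ...


class InMemoryStore:
    """Test double: naive keyword match instead of a vector-database query."""

    def __init__(self, chunks: list[Chunk]) -> None:
        self._chunks = chunks

    def search(self, query: str, top_k: int) -> list[Chunk]:
        words = query.lower().split()
        hits = [c for c in self._chunks if any(w in c.text.lower() for w in words)]
        return hits[:top_k]


class RetrievalService:
    """Business logic receives its store through the constructor (dependency injection)."""

    def __init__(self, store: EvidenceStore) -> None:
        self._store = store

    def evidence_for(self, question: str) -> list[str]:
        return [f"{c.text} (p. {c.page})" for c in self._store.search(question, top_k=3)]


service = RetrievalService(
    InMemoryStore([Chunk("Revenue grew in Q4", 7), Chunk("Headcount rose", 9)])
)
result = service.evidence_for("revenue")
```

Because `RetrievalService` only knows the `EvidenceStore` protocol, unit tests can inject the in-memory double while production wiring injects a vector-store adapter, with no change to the service itself.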
Six agents, supervisor-coordinated:
| Agent | Role |
|---|---|
| Supervisor | Triage: decide whether the question needs live external web data; route the flow. |
| Retrieval | Hybrid search over the user's documents in Qdrant; returns cited evidence. |
| Market Research | Live web/financial research via the MCP web_search tool. |
| Analyst | Synthesize evidence; compute ratios, comparisons, trends. |
| Writer | Compose the final answer with inline [n] citations. |
| Critic | Verify every claim is grounded and cited; bounce back for a bounded revision loop. |
START → supervisor → retrieval → [market_research if needed] → analyst → writer → critic
→ analyst (revise, ≤2) | END
State is persisted per conversation thread with LangGraph's AsyncPostgresSaver, so threads are
durable and resumable.
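
The routing above, including the bounded revision loop, can be sketched in plain Python. This is a simulation of the control flow only, under the assumption that the critic returns a simple approve/reject decision; it does not use LangGraph, and `State`, `run`, and `MAX_REVISIONS` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Callable

MAX_REVISIONS = 2  # mirrors the "revise, <=2" bound in the flow


@dataclass
class State:
    question: str
    needs_web: bool = False
    revisions: int = 0
    trace: list[str] = field(default_factory=list)


def run(state: State, critic_approves: Callable[[State], bool]) -> State:
    """Walk the agents in order, looping critic -> analyst at most MAX_REVISIONS times."""
    state.trace += ["supervisor", "retrieval"]
    if state.needs_web:
        state.trace.append("market_research")
    while True:
        state.trace += ["analyst", "writer", "critic"]
        # Stop when the critic approves or the revision budget is spent.
        if critic_approves(state) or state.revisions >= MAX_REVISIONS:
            return state
        state.revisions += 1


approved = run(State("Q4 revenue?", needs_web=True), lambda s: True)
exhausted = run(State("Q4 revenue?"), lambda s: False)
```

Capping the loop keeps latency and token cost bounded even when the critic never approves: the first pass plus two revisions is the worst case.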
… gemini-embedding-2, 3072-d, cosine) fused with a …
… [n] markers
A dedicated MCP server (FastMCP, streamable-HTTP) exposes four tools; the Market Research agent
is an MCP client that calls them over the protocol:
- web_search: DuckDuckGo web search (no API key)
- company_financials: focused search for a company's latest results
- fetch_url: fetch and clean a web page's text
- financial_calculator: safe arithmetic evaluation (AST-based)

This satisfies both the ≥3 tools requirement and the optional MCP communication enhancement.
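
As an illustration of what "safe arithmetic evaluation (AST-based)" can mean, here is a minimal sketch that parses an expression with Python's `ast` module and evaluates only whitelisted node types. It is an assumption about the general technique, not the repository's `financial_calculator` code; `safe_eval` is a hypothetical name, and exponentiation is deliberately left out to avoid unbounded computations.

```python
import ast
import operator

# Whitelisted operators; anything else (calls, names, attributes) is rejected.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_eval(expression: str) -> float:
    """Evaluate +, -, *, / over numeric literals without ever calling eval()."""

    def _eval(node: ast.AST) -> float:
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](_eval(node.operand))
        raise ValueError(f"unsupported expression element: {type(node).__name__}")

    return _eval(ast.parse(expression, mode="eval"))


# Quarter-over-quarter growth in percent, using the sample report's Q3 and Q4 figures.
growth = safe_eval("(1180 - 1015) / 1015 * 100")
```

Because the evaluator walks the syntax tree itself, input such as `__import__('os').getcwd()` is rejected with a `ValueError` rather than executed.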
Reusable, self-contained capabilities any agent or client can invoke (behind a TextGenerator
port): summarize, translate, fact_check β exposed via GET/POST /api/v1/skills.
evals/run_eval.py benchmarks FinSight against a no-RAG baseline on a labeled set of questions
about the sample report, with LangSmith tracing enabled:
… [n] citations.

This demonstrates the optional formal evaluation metrics and baseline benchmarking enhancement.
LangGraph · LangChain · Google Gemini · FastAPI · Qdrant · PostgreSQL · Redis · ARQ ·
Cloudinary · MCP · LangSmith · ruff · pytest · Docker Compose.
```shell
cp .env.example .env                      # set GOOGLE_API_KEY (free: aistudio.google.com/apikey)
docker compose up -d --build              # postgres, qdrant, redis, mcp, api, worker
docker compose exec api alembic upgrade head
cd frontend && npm install && npm run dev # → http://localhost:5173
```
A ready-made report ships in the repo at samples/sample_financial_report.docx
(Nimbus Cloud Inc. FY2024). Sign up (email verification auto-passes in dev mode), create a topic,
upload the sample, then try:
| Ask | What to expect |
|---|---|
| What was Q4 2024 revenue and net income? | Grounded [1] citation |
| Plot quarterly revenue for 2024 as a chart | A line/bar chart (820 → 910 → 1,015 → 1,180) |
| Show revenue breakdown by segment as a pie chart | A pie (Cloud 50% · Data 28% · AI 22%) |
```shell
# benchmark RAG vs. a no-RAG baseline (LangSmith tracing on)
docker compose exec api python -m evals.run_eval
```
The full reviewer walkthrough lives in the repo README.md, under Quick demo (for reviewers).
Q: What does NVIDIA do and what is its most recent reported quarterly revenue?
A: NVIDIA is a technology company focused on AI products and data centers [4, 5]. For Q2
fiscal 2026, NVIDIA reported revenue of $46.7 billion, +56% year-over-year [3].

Sources: [3] investor.nvidia.com · [2] Wikipedia · [4] TradingView · [5] Yahoo Finance
The Supervisor routed to Market Research (live web needed), which called the MCP web_search tool;
the Writer cited the sources and the Critic approved.
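
One mechanical part of a grounding check is confirming that every [n] marker in an answer resolves to a listed source. The sketch below shows that idea on the example answer; it is an assumption about how such a check could look, not the Critic's actual implementation, and `cited_indices` and `dangling_citations` are hypothetical names.

```python
import re

# Matches markers such as [3] or [4, 5].
CITATION = re.compile(r"\[(\d+(?:\s*,\s*\d+)*)\]")


def cited_indices(answer: str) -> set[int]:
    """Collect every source index referenced by an inline [n] or [n, m] marker."""
    found: set[int] = set()
    for group in CITATION.findall(answer):
        found.update(int(n) for n in group.split(","))
    return found


def dangling_citations(answer: str, sources: dict[int, str]) -> set[int]:
    """Return cited indices that have no matching entry in the source list."""
    return cited_indices(answer) - set(sources)


answer = (
    "NVIDIA is a technology company focused on AI products and data centers [4, 5]. "
    "For Q2 fiscal 2026, NVIDIA reported revenue of $46.7 billion, "
    "+56% year-over-year [3]."
)
sources = {
    2: "Wikipedia",
    3: "investor.nvidia.com",
    4: "TradingView",
    5: "Yahoo Finance",
}
missing = dangling_citations(answer, sources)
```

An empty result means every marker points at a real source; it says nothing about whether the source actually supports the claim, which still needs a semantic check.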
… pytest unit tests, ruff-clean, full SOLID layering, runs entirely via Docker Compose.

Tags: multi-agent · langgraph · rag · mcp · agentic-ai · financial-analysis ·
qdrant · fastapi · react · llm · google-gemini · hybrid-search · contextual-retrieval · aaidc