Traditional Retrieval-Augmented Generation (RAG) systems, while effective for direct question-answering, often struggle with complex, multi-step queries that require planning, decomposition, and synthesis of information from diverse sources. This paper introduces Synapse, a collaborative multi-agent system designed to overcome these limitations in the domain of financial and policy research. Synapse employs a team of three specialized AI agents (a Research Manager, a Search Specialist, and a Financial Analyst) orchestrated by the LangGraph framework. By assigning distinct roles and tools to each agent and managing their interaction through a shared state object, the system can autonomously decompose a complex query, execute a targeted information-gathering strategy, and synthesize a comprehensive, data-driven report. We demonstrate through experimental evaluation that this structured, multi-agent approach provides significantly more reliable, comprehensive, and factually grounded results compared to a monolithic, single-agent RAG baseline. Furthermore, we discuss the critical design evolution from a conversational agent flow to a more robust, programmatic workflow, highlighting key lessons in building reliable agentic systems.
The proliferation of Large Language Models (LLMs) has led to the development of powerful information retrieval systems, with Retrieval-Augmented Generation (RAG) being a predominant architecture (Lewis et al., 2020). RAG enhances LLM responses by grounding them in external knowledge, mitigating hallucinations and providing access to up-to-date information. However, standard RAG implementations typically operate as a monolithic process: a single agent receives a query, retrieves relevant documents, and generates a response. This single-threaded approach proves brittle when faced with complex queries that necessitate a multi-stage research process, such as: "How has Apple's recent product announcement affected its stock price and news coverage?"
Such queries demand a sequence of distinct cognitive tasks: (1) decomposing the question into a concrete research plan, (2) gathering information from multiple sources such as news and market data, and (3) synthesizing the gathered evidence into a coherent, data-grounded answer.
This paper presents Synapse, a multi-agent system designed to explicitly model this research workflow. Synapse leverages a team of specialized agents, each responsible for one phase of the process, to collaboratively solve complex financial and policy queries. Our contributions are threefold: (1) the design and implementation of a multi-agent research system orchestrated as a stateful LangGraph workflow; (2) an empirical comparison against a monolithic single-agent RAG baseline demonstrating gains in reliability, comprehensiveness, and factuality; and (3) an analysis of the design evolution from a conversational agent flow to a deterministic, programmatic workflow, with lessons for building reliable agentic systems.
Our work is situated at the intersection of Retrieval-Augmented Generation, Multi-Agent Systems, and Agentic AI.
Retrieval-Augmented Generation (RAG): The concept of augmenting LLMs with external knowledge is well-established. However, advanced RAG techniques have begun to explore more complex, iterative processes, such as generating multiple queries or self-correcting retrieved documents (Gao et al., 2023), which hints at the need for more structured workflows.
Multi-Agent Systems (MAS): The use of multiple interacting agents to solve problems is a core concept in AI. In the context of LLMs, frameworks like AutoGen (Wu et al., 2023) and CrewAI have popularized the idea of "society of agents." These frameworks facilitate complex task execution through conversational agent interaction. Synapse builds on this paradigm but utilizes LangGraph to impose a more explicit, stateful, and directed graph structure on the agent collaboration, which we argue is critical for reliability.
Agentic AI and Tool Use: Modern agentic systems are defined by their ability to use external tools to extend their capabilities. The ReAct (Reason and Act) framework demonstrated how LLMs can interleave reasoning with tool use to solve problems (Yao et al., 2022). Our agents are fundamentally tool-users, but our work highlights a key challenge: ensuring agents use tools reliably rather than merely "talking about" using them.
The Synapse system is designed as a stateful graph where nodes represent agents and edges represent the flow of control. The entire workflow is orchestrated by LangGraph, with a shared `AgentState` object serving as the persistent memory and data-passing mechanism between agents.
The workflow proceeds through a sequence of three specialized agents (Figure 1).
```
[User Query] -> [1. Research Manager] -> [2. Search Specialist] -> [3. Financial Analyst] -> [Final Report]
                        |                        |                        |
                        +--------> (AgentState is updated at each step) <-+
```
Figure 1: The sequential workflow of the Synapse multi-agent system.
The `AgentState` is a Python `TypedDict` that contains fields such as `original_query`, `research_plan`, and `final_report`, allowing each agent to access the work of its predecessors.
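The node-and-state pattern can be sketched in plain Python. The stub nodes and the `run_pipeline` helper below are hypothetical stand-ins for the compiled LangGraph graph (a real implementation would build a `StateGraph` and call `compile()`); they illustrate only the state-merging contract between nodes:

```python
from typing import TypedDict, Dict

class AgentState(TypedDict):
    """Shared state passed between agent nodes."""
    original_query: str
    research_plan: Dict
    search_summary: str
    final_report: str

# Stub nodes standing in for the three agents; each reads the state
# and returns only the fields it updates, mirroring LangGraph node semantics.
def research_manager(state: AgentState) -> dict:
    return {"research_plan": {"queries": [state["original_query"]], "tickers": []}}

def search_specialist(state: AgentState) -> dict:
    n = len(state["research_plan"]["queries"])
    return {"search_summary": f"Ran {n} search(es)."}

def financial_analyst(state: AgentState) -> dict:
    return {"final_report": f"Report for: {state['original_query']}"}

def run_pipeline(query: str) -> AgentState:
    """Sequentially applies each node and merges its partial update."""
    state: AgentState = {"original_query": query, "research_plan": {},
                         "search_summary": "", "final_report": ""}
    for node in (research_manager, search_specialist, financial_analyst):
        state.update(node(state))
    return state
```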
Research Manager: This agent acts as the team lead. It receives the user's raw query and its sole responsibility is to decompose it into a machine-readable plan. Initially designed to produce a natural language plan, this proved unreliable. The final implementation uses the LLM's structured output capabilities to generate a Pydantic object containing a list of concise search queries and a list of identified stock tickers. This change from conversational output to structured data was critical for system reliability.
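The shape of the structured plan can be sketched as follows. A plain dataclass stands in for the paper's Pydantic model to keep the sketch dependency-free, and `make_plan_stub` is a hypothetical, deterministic stand-in for the LLM structured-output call (with a real model one would use something like LangChain's `llm.with_structured_output(...)`):

```python
from dataclasses import dataclass, field
from typing import List
import re

@dataclass
class ResearchPlan:
    """Machine-readable plan: search queries plus identified tickers."""
    search_queries: List[str] = field(default_factory=list)
    tickers: List[str] = field(default_factory=list)

def make_plan_stub(query: str) -> ResearchPlan:
    # Naive ticker heuristic purely for illustration; the real system
    # relies on the LLM to identify tickers.
    tickers = re.findall(r"\b[A-Z]{2,5}\b", query)
    return ResearchPlan(search_queries=[query], tickers=tickers)
```

Returning typed data rather than free-form text is what lets downstream agents consume the plan programmatically.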
Search Specialist: This agent is a pure executor. It takes the `research_plan` from the state object and systematically calls its assigned tools to populate a central ChromaDB vector store. Its tools include the Tavily Web Search API and the GDELT API for broad, real-time news gathering. Its output is a simple confirmation message, as its primary contribution is the side effect of populating the database.
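The specialist's execute-every-query loop can be sketched as below. `tool_search_stub` is a hypothetical placeholder for the Tavily/GDELT calls, and a plain list stands in for the ChromaDB collection:

```python
from typing import Dict, List

def tool_search_stub(query: str) -> List[str]:
    # Placeholder for Tavily / GDELT API calls.
    return [f"snippet about {query}"]

def run_planned_searches(state: Dict) -> Dict:
    """Executes every planned query and stores the results; the real
    system writes documents into a ChromaDB collection instead of a list."""
    store: List[str] = state.setdefault("document_store", [])
    for q in state["research_plan"]["search_queries"]:
        store.extend(tool_search_stub(q))
    # Only a confirmation is returned; populating the store is the point.
    return {"search_summary": f"Stored {len(store)} document(s)."}
```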
Financial Analyst: This agent is the final synthesizer. A critical lesson was learned in its design. An initial implementation as a general tool-calling agent often resulted in "lazy" behavior, where the agent would generate a generic report without calling its tools. The final, robust implementation is a programmatic function that first makes explicit, non-negotiable calls to its tools (`Vector Database Search` and `Stock Price Tool`) to gather all necessary context. Only after this context is gathered is it passed to an LLM with a highly focused prompt to synthesize the final report. This removes the decision of tool use from the LLM, forcing a deterministic and reliable workflow.
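The tools-before-LLM ordering can be sketched as follows. The three stub functions are hypothetical placeholders for the vector store search, the stock price tool, and the LLM call; the point is that both tool calls happen unconditionally before the model sees anything:

```python
def vector_search_stub(query: str) -> str:
    return f"news context for {query}"      # placeholder for ChromaDB retrieval

def stock_price_stub(ticker: str) -> str:
    return f"{ticker}: $100.00"             # placeholder for the stock price tool

def llm_stub(prompt: str) -> str:
    return "REPORT: " + prompt              # placeholder for the LLM call

def financial_analyst_node(state: dict) -> dict:
    """Deterministic node: tools are called unconditionally *before* the
    LLM is invoked, so the model cannot skip the retrieval step."""
    context = vector_search_stub(state["original_query"])
    tickers = state["research_plan"].get("tickers", [])
    stock_context = "\n".join(stock_price_stub(t) for t in tickers)
    prompt = (f"Query: {state['original_query']}\n"
              f"News: {context}\nStocks: {stock_context}")
    return {"final_report": llm_stub(prompt)}
```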
Synapse integrates a suite of tools to empower its agents: the Tavily Web Search API for targeted web search, the GDELT API for broad, real-time news gathering, a ChromaDB vector store for persisting and retrieving gathered documents, and a stock price tool for fetching current market data.
To evaluate the effectiveness of Synapse, we compared its performance against a baseline single-agent RAG system on three distinct query types.
The comparative results are summarized in Table 1.
| Query | System | Reliability | Comprehensiveness | Factuality | Outcome |
|---|---|---|---|---|---|
| Q1 (Broad) | Baseline | Low | Low | Low | Failed to call search tool; generated a generic, hallucinatory response. |
| Q1 (Broad) | Synapse | High | High | High | Successfully searched for news and synthesized a relevant summary. |
| Q2 (Specific) | Baseline | Low | Low | Low | Claimed it could not fetch commodity prices, even with a web search tool. |
| Q2 (Specific) | Synapse | High | High | High | Correctly used web search to find recent news on Brent crude prices. |
| Q3 (Hybrid) | Baseline | Low | Low | Low | Failed to orchestrate the two required steps (stock lookup and news search). |
| Q3 (Hybrid) | Synapse | High | High | High | Successfully fetched the NVDA stock price and news, then synthesized them into a single report. |
Table 1: Comparative performance of the Baseline and Synapse systems.
The baseline single-agent system consistently failed on all but the simplest queries. It exhibited "lazy" behavior, often opting to generate a generic response from its parametric memory rather than undertaking the complex process of using its tools. In contrast, Synapse reliably executed the necessary steps for all three queries, producing comprehensive and factually grounded reports.
The experimental results strongly indicate that for complex, multi-step information retrieval tasks, a structured multi-agent architecture is superior to a monolithic one. The division of labor allows each agent to excel at its specific task.
The most significant finding of this work, however, was the critical importance of workflow determinism. Our initial design, which relied on conversational handoffs between intelligent agents, was brittle. The `FinancialAnalyst` agent's failure to reliably call its tools exposed a fundamental challenge in agentic AI: LLMs will often take the path of least resistance. By redesigning the analyst's node to be a programmatic function that forces tool execution before LLM synthesis, we shifted from a probabilistic workflow to a deterministic one, dramatically increasing system reliability.
Limitations: The system's performance is still heavily dependent on the quality of its underlying tools and the clarity of the `ResearchManager`'s output. Furthermore, the sequential nature of the graph can lead to longer processing times.
This paper presented Synapse, a multi-agent system that demonstrates a robust and effective approach to complex financial research queries. By decomposing the problem and assigning specialized agents to each sub-task within a stateful, graph-based framework, Synapse overcomes the limitations of traditional single-agent RAG systems. Our key contribution is the empirical finding that enforcing a deterministic, programmatic workflow, especially in the final synthesis stage, is crucial for building reliable and predictable agentic systems. Future work will explore adding human-in-the-loop validation steps and incorporating more diverse agent roles, such as data visualization specialists.
We would like to thank the developers of the LangChain and LangGraph libraries, whose open-source tools were foundational to this project. We also acknowledge the providers of the Tavily, GDELT, and Google Gemini APIs for making their services available.
`FinancialAnalyst` Synthesis Prompt

```python
from langchain_core.prompts import ChatPromptTemplate

report_prompt = ChatPromptTemplate.from_template(
    """You are a master financial analyst. Your task is to write a clear, concise, and insightful report based *only* on the provided context.

Here is the user's original query: {original_query}

Here is the context you have gathered from your tools:
---
News Context:
{context}
---
Stock Market Context:
{stock_context}
---

Synthesize this information into a final report. Do not mention your tools or the context directly. Just provide the report. If the context is empty or unhelpful, state that you could not find relevant information.
"""
)
```
`AgentState` Definition

```python
# In graph.py
from typing import TypedDict, Dict, List

class AgentState(TypedDict):
    """A shared state object passed between agents in the graph."""
    original_query: str
    research_plan: Dict
    search_summary: str
    final_report: str
```