Traditional Retrieval-Augmented Generation (RAG) systems, while effective for direct question-answering, often struggle with complex, multi-step queries that require planning, decomposition, and synthesis of information from diverse sources. This paper introduces Synapse, a collaborative multi-agent system designed to overcome these limitations in the domain of financial and policy research. Synapse employs a team of three specialized AI agents (a Research Manager, a Search Specialist, and a Financial Analyst) orchestrated by the LangGraph framework. By assigning distinct roles and tools to each agent and managing their interaction through a shared state object, the system can autonomously decompose a complex query, execute a targeted information-gathering strategy, and synthesize a comprehensive, data-driven report. We demonstrate through experimental evaluation that this structured, multi-agent approach provides significantly more reliable, comprehensive, and factually grounded results compared to a monolithic, single-agent RAG baseline. Furthermore, we discuss the critical design evolution from a conversational agent flow to a more robust, programmatic workflow, highlighting key lessons in building reliable agentic systems.
The proliferation of Large Language Models (LLMs) has led to the development of powerful information retrieval systems, with Retrieval-Augmented Generation (RAG) being a predominant architecture (Lewis et al., 2020). RAG enhances LLM responses by grounding them in external knowledge, mitigating hallucinations and providing access to up-to-date information. However, standard RAG implementations typically operate as a monolithic process: a single agent receives a query, retrieves relevant documents, and generates a response. This single-threaded approach proves brittle when faced with complex queries that necessitate a multi-stage research process, such as: "How has Apple's recent product announcement affected its stock price and news coverage?"
Such queries demand a sequence of distinct cognitive tasks: (1) decomposing the question into a concrete research plan, (2) gathering information from multiple sources such as news and market data, and (3) synthesizing the gathered evidence into a coherent, data-grounded answer.
This paper presents Synapse, a multi-agent system designed to explicitly model this research workflow. Synapse leverages a team of specialized agents, each responsible for one phase of the process, to collaboratively solve complex financial and policy queries. Our contributions are threefold: (1) the design and implementation of a multi-agent research system orchestrated as a stateful LangGraph workflow; (2) an empirical comparison against a monolithic single-agent RAG baseline demonstrating gains in reliability, comprehensiveness, and factuality; and (3) an analysis of the design evolution from a conversational agent flow to a deterministic, programmatic workflow, with lessons for building reliable agentic systems.
Our work is situated at the intersection of Retrieval-Augmented Generation, Multi-Agent Systems, and Agentic AI.
Retrieval-Augmented Generation (RAG): The concept of augmenting LLMs with external knowledge is well-established. However, advanced RAG techniques have begun to explore more complex, iterative processes, such as generating multiple queries or self-correcting retrieved documents (Gao et al., 2023), which hints at the need for more structured workflows.
Multi-Agent Systems (MAS): The use of multiple interacting agents to solve problems is a core concept in AI. In the context of LLMs, frameworks like AutoGen (Wu et al., 2023) and CrewAI have popularized the idea of "society of agents." These frameworks facilitate complex task execution through conversational agent interaction. Synapse builds on this paradigm but utilizes LangGraph to impose a more explicit, stateful, and directed graph structure on the agent collaboration, which we argue is critical for reliability.
Agentic AI and Tool Use: Modern agentic systems are defined by their ability to use external tools to extend their capabilities. The ReAct (Reason and Act) framework demonstrated how LLMs can interleave reasoning with tool use to solve problems (Yao et al., 2022). Our agents are fundamentally tool-users, but our work highlights a key challenge: ensuring agents use tools reliably rather than merely "talking about" using them.
The Synapse system is designed as a stateful graph where nodes represent agents and edges represent the flow of control. The entire workflow is orchestrated by LangGraph, with a shared `AgentState` object serving as the persistent memory and data-passing mechanism between agents.
The workflow proceeds through a sequence of three specialized agents (Figure 1).
```
[User Query] -> [1. Research Manager] -> [2. Search Specialist] -> [3. Financial Analyst] -> [Final Report]
                        |                        |                        |
                        +--------> (AgentState is updated at each step) <-+
```
Figure 1: The sequential workflow of the Synapse multi-agent system.
The `AgentState` is a Python `TypedDict` that contains fields such as `original_query`, `research_plan`, and `final_report`, allowing each agent to access the work of its predecessors.
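The node-and-state pattern can be sketched in plain Python. The stub nodes and the `run_pipeline` helper below are hypothetical stand-ins for the compiled LangGraph graph (a real implementation would build a `StateGraph` and call `compile()`); they illustrate only the state-merging contract between nodes:

```python
from typing import TypedDict, Dict

class AgentState(TypedDict):
    """Shared state passed between agent nodes."""
    original_query: str
    research_plan: Dict
    search_summary: str
    final_report: str

# Stub nodes standing in for the three agents; each reads the state
# and returns only the fields it updates, mirroring LangGraph node semantics.
def research_manager(state: AgentState) -> dict:
    return {"research_plan": {"queries": [state["original_query"]], "tickers": []}}

def search_specialist(state: AgentState) -> dict:
    n = len(state["research_plan"]["queries"])
    return {"search_summary": f"Ran {n} search(es)."}

def financial_analyst(state: AgentState) -> dict:
    return {"final_report": f"Report for: {state['original_query']}"}

def run_pipeline(query: str) -> AgentState:
    """Sequentially applies each node and merges its partial update."""
    state: AgentState = {"original_query": query, "research_plan": {},
                         "search_summary": "", "final_report": ""}
    for node in (research_manager, search_specialist, financial_analyst):
        state.update(node(state))
    return state
```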
Research Manager: This agent acts as the team lead. It receives the user's raw query and its sole responsibility is to decompose it into a machine-readable plan. Initially designed to produce a natural language plan, this proved unreliable. The final implementation uses the LLM's structured output capabilities to generate a Pydantic object containing a list of concise search queries and a list of identified stock tickers. This change from conversational output to structured data was critical for system reliability.
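The shape of the structured plan can be sketched as follows. A plain dataclass stands in for the paper's Pydantic model to keep the sketch dependency-free, and `make_plan_stub` is a hypothetical, deterministic stand-in for the LLM structured-output call (with a real model one would use something like LangChain's `llm.with_structured_output(...)`):

```python
from dataclasses import dataclass, field
from typing import List
import re

@dataclass
class ResearchPlan:
    """Machine-readable plan: search queries plus identified tickers."""
    search_queries: List[str] = field(default_factory=list)
    tickers: List[str] = field(default_factory=list)

def make_plan_stub(query: str) -> ResearchPlan:
    # Naive ticker heuristic purely for illustration; the real system
    # relies on the LLM to identify tickers.
    tickers = re.findall(r"\b[A-Z]{2,5}\b", query)
    return ResearchPlan(search_queries=[query], tickers=tickers)
```

Returning typed data rather than free-form text is what lets downstream agents consume the plan programmatically.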
Search Specialist: This agent is a pure executor. It takes the `research_plan` from the state object and systematically calls its assigned tools to populate a central ChromaDB vector store. Its tools include the Tavily Web Search API and the GDELT API for broad, real-time news gathering. Its output is a simple confirmation message, as its primary contribution is the side effect of populating the database.
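The specialist's execute-every-query loop can be sketched as below. `tool_search_stub` is a hypothetical placeholder for the Tavily/GDELT calls, and a plain list stands in for the ChromaDB collection:

```python
from typing import Dict, List

def tool_search_stub(query: str) -> List[str]:
    # Placeholder for Tavily / GDELT API calls.
    return [f"snippet about {query}"]

def run_planned_searches(state: Dict) -> Dict:
    """Executes every planned query and stores the results; the real
    system writes documents into a ChromaDB collection instead of a list."""
    store: List[str] = state.setdefault("document_store", [])
    for q in state["research_plan"]["search_queries"]:
        store.extend(tool_search_stub(q))
    # Only a confirmation is returned; populating the store is the point.
    return {"search_summary": f"Stored {len(store)} document(s)."}
```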
Financial Analyst: This agent is the final synthesizer. A critical lesson was learned in its design. An initial implementation as a general tool-calling agent often resulted in "lazy" behavior, where the agent would generate a generic report without calling its tools. The final, robust implementation is a programmatic function that first makes explicit, non-negotiable calls to its tools (`Vector Database Search` and `Stock Price Tool`) to gather all necessary context. Only after this context is gathered is it passed to an LLM with a highly focused prompt to synthesize the final report. This removes the decision of tool use from the LLM, forcing a deterministic and reliable workflow.
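The tools-before-LLM ordering can be sketched as follows. The three stub functions are hypothetical placeholders for the vector store search, the stock price tool, and the LLM call; the point is that both tool calls happen unconditionally before the model sees anything:

```python
def vector_search_stub(query: str) -> str:
    return f"news context for {query}"      # placeholder for ChromaDB retrieval

def stock_price_stub(ticker: str) -> str:
    return f"{ticker}: $100.00"             # placeholder for the stock price tool

def llm_stub(prompt: str) -> str:
    return "REPORT: " + prompt              # placeholder for the LLM call

def financial_analyst_node(state: dict) -> dict:
    """Deterministic node: tools are called unconditionally *before* the
    LLM is invoked, so the model cannot skip the retrieval step."""
    context = vector_search_stub(state["original_query"])
    tickers = state["research_plan"].get("tickers", [])
    stock_context = "\n".join(stock_price_stub(t) for t in tickers)
    prompt = (f"Query: {state['original_query']}\n"
              f"News: {context}\nStocks: {stock_context}")
    return {"final_report": llm_stub(prompt)}
```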
Synapse integrates a suite of tools to empower its agents: the Tavily Web Search API for targeted web search, the GDELT API for broad, real-time news gathering, a ChromaDB vector store for persisting and retrieving gathered documents, and a stock price tool for fetching current market data.
To evaluate the effectiveness of Synapse, we compared its performance against a baseline single-agent RAG system on three distinct query types.
The comparative results are summarized in Table 1.
| Query | System | Reliability | Comprehensiveness | Factuality | Outcome |
|---|---|---|---|---|---|
| Q1 (Broad) | Baseline | Low | Low | Low | Failed to call search tool; generated a generic, hallucinatory response. |
| Q1 (Broad) | Synapse | High | High | High | Successfully searched for news and synthesized a relevant summary. |
| Q2 (Specific) | Baseline | Low | Low | Low | Claimed it could not fetch commodity prices, even with a web search tool. |
| Q2 (Specific) | Synapse | High | High | High | Correctly used web search to find recent news on Brent crude prices. |
| Q3 (Hybrid) | Baseline | Low | Low | Low | Failed to orchestrate the two required steps (stock lookup and news search). |
| Q3 (Hybrid) | Synapse | High | High | High | Successfully fetched the NVDA stock price and news, then synthesized them into a single report. |
Table 1: Comparative performance of the Baseline and Synapse systems.
The baseline single-agent system consistently failed on all but the simplest queries. It exhibited "lazy" behavior, often opting to generate a generic response from its parametric memory rather than undertaking the complex process of using its tools. In contrast, Synapse reliably executed the necessary steps for all three queries, producing comprehensive and factually grounded reports.
The experimental results strongly indicate that for complex, multi-step information retrieval tasks, a structured multi-agent architecture is superior to a monolithic one. The division of labor allows each agent to excel at its specific task.
The most significant finding of this work, however, was the critical importance of workflow determinism. Our initial design, which relied on conversational handoffs between intelligent agents, was brittle. The `FinancialAnalyst` agent's failure to reliably call its tools exposed a fundamental challenge in agentic AI: LLMs will often take the path of least resistance. By redesigning the analyst's node to be a programmatic function that forces tool execution before LLM synthesis, we shifted from a probabilistic workflow to a deterministic one, dramatically increasing system reliability.
Limitations: The system's performance is still heavily dependent on the quality of its underlying tools and the clarity of the `ResearchManager`'s output. Furthermore, the sequential nature of the graph can lead to longer processing times.
This paper presented Synapse, a multi-agent system that demonstrates a robust and effective approach to complex financial research queries. By decomposing the problem and assigning specialized agents to each sub-task within a stateful, graph-based framework, Synapse overcomes the limitations of traditional single-agent RAG systems. Our key contribution is the empirical finding that enforcing a deterministic, programmatic workflow, especially in the final synthesis stage, is crucial for building reliable and predictable agentic systems. Future work will explore adding human-in-the-loop validation steps and incorporating more diverse agent roles, such as data visualization specialists.
We would like to thank the developers of the LangChain and LangGraph libraries, whose open-source tools were foundational to this project. We also acknowledge the providers of the Tavily, GDELT, and Google Gemini APIs for making their services available.
`FinancialAnalyst` Synthesis Prompt

```python
from langchain_core.prompts import ChatPromptTemplate

report_prompt = ChatPromptTemplate.from_template(
    """You are a master financial analyst. Your task is to write a clear, concise, and insightful report based *only* on the provided context.

Here is the user's original query: {original_query}

Here is the context you have gathered from your tools:
---
News Context:
{context}
---
Stock Market Context:
{stock_context}
---

Synthesize this information into a final report. Do not mention your tools or the context directly. Just provide the report. If the context is empty or unhelpful, state that you could not find relevant information.
"""
)
```
`AgentState` Definition

```python
# In graph.py
from typing import TypedDict, Dict, List

class AgentState(TypedDict):
    """A shared state object passed between agents in the graph."""
    original_query: str
    research_plan: Dict
    search_summary: str
    final_report: str
```