RAG-Based Financial Research Assistant


TL;DR

The RAG-Based Financial Research Assistant is a sophisticated AI-powered application that processes a curated dataset of Amazon SEC filings and earnings transcripts using Retrieval-Augmented Generation (RAG) techniques. The system leverages advanced sentence-transformer embeddings, a local Chroma vector database, and the Llama 3 language model (accessed via Groq) to respond to natural language user inquiries with precision and deep context awareness. This tool delivers a scalable and effective solution for financial analysts, researchers, and organizations seeking streamlined access to Amazon’s financial documentation. By integrating open-source embeddings, high-performance semantic search, and an intuitive Streamlit interface, the assistant enables users to efficiently explore document contents and receive instant, context-grounded answers to their questions.

Overview

The RAG-Based Financial Research Assistant for Amazon is a retrieval-augmented generation (RAG) application that combines the power of local vector indexing and open-source large language models to transform how financial analysts, investors, and business users interact with complex regulatory filings. By embedding, indexing, and searching Amazon's SEC filings (10-K, 10-Q) and earnings call transcripts, the system enables users to ask natural language questions and receive instant, accurate, and context-grounded answers with cited sources. Leveraging the Llama 3 model (via Groq), advanced text embeddings, and the Chroma vector store, the assistant dramatically accelerates due diligence, research, and decision-making, eliminating hours of manual document review while maintaining traceability and transparency of all insights provided.

Workflow

User Query via Streamlit UI
Prompt Construction (LangChain PromptTemplate; combines user query with retrieval context)
Semantic Embedding (sentence-transformers/all-MiniLM-L6-v2)
Vector Store Retrieval (Chroma local vector index)
Relevant Chunks Selected
LLM Answer Generation (Llama 3 via Groq, using LangChain)
Display Answer & Sources in Streamlit UI
Session Memory (optional; ConversationBuffer or similar)
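
The optional session memory step can be pictured as a small buffer of recent question and answer pairs that is rendered as text and prepended to the next prompt. The sketch below is a standard-library illustration only: the class name ChatHistory, the max_turns limit, and the "User:/Assistant:" layout are assumptions made for this example, not the project's actual memory component.

```python
from collections import deque


class ChatHistory:
    """Hypothetical bounded buffer of recent (question, answer) turns."""

    def __init__(self, max_turns: int = 5):
        # deque(maxlen=...) silently drops the oldest turn once the buffer is full.
        self._turns = deque(maxlen=max_turns)

    def add_turn(self, question: str, answer: str) -> None:
        self._turns.append((question, answer))

    def as_text(self) -> str:
        """Render the kept turns as plain text for inclusion in a prompt."""
        return "\n".join(f"User: {q}\nAssistant: {a}" for q, a in self._turns)


history = ChatHistory(max_turns=2)
history.add_turn("first question", "first answer")
history.add_turn("second question", "second answer")
history.add_turn("third question", "third answer")
print(history.as_text())  # only the second and third turns remain
```

Capping the number of turns keeps the prompt short; an unbounded buffer would keep the full conversation at the cost of a growing prompt.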


Finance Overview

The field of financial analysis relies on the systematic study of a company’s financial statements, regulatory filings, and management communications to assess business health, profitability, risk exposure, and growth potential.
Key documents include the 10-K (annual report) and 10-Q (quarterly report), both mandated by the SEC for all publicly traded U.S. companies. These reports present the income statement (revenues, costs, and net income), balance sheet (assets, liabilities, equity), and cash flow statement (operating, investing, and financing activities). Analysts use these to evaluate metrics such as gross margin, operating expenses, and earnings per share (EPS), helping stakeholders determine company performance and value. Additionally, these filings include Management’s Discussion and Analysis (MD&A), where executives interpret results, discuss strategy, and highlight risks—such as regulatory changes, market competition, or supply chain vulnerabilities. Earnings call transcripts further enrich analysis by providing direct insight into management’s outlook, responses to analyst questions, and forward-looking statements (such as revenue guidance or capital expenditures).
For example, an analyst might examine trends in Amazon’s AWS segment, compare R&D investment year-over-year, or analyze risk disclosures related to global expansion. Collectively, these finance domain concepts allow investors, regulators, and internal stakeholders to make data-driven decisions about investment, compliance, and corporate strategy.
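
As a small worked illustration of two of the metrics mentioned above, the sketch below computes gross margin and basic earnings per share from income statement line items. The IncomeSnapshot class and every number in it are hypothetical and chosen only to make the arithmetic easy to follow; none of the figures come from an Amazon filing.

```python
from dataclasses import dataclass


@dataclass
class IncomeSnapshot:
    """Hypothetical line items, in millions of dollars (shares in millions)."""
    revenue: float
    cost_of_sales: float
    net_income: float
    weighted_avg_shares: float


def gross_margin(s: IncomeSnapshot) -> float:
    """Gross margin = (revenue - cost of sales) / revenue."""
    return (s.revenue - s.cost_of_sales) / s.revenue


def earnings_per_share(s: IncomeSnapshot) -> float:
    """Basic EPS = net income / weighted-average shares outstanding."""
    return s.net_income / s.weighted_avg_shares


# Illustrative numbers only; these are NOT taken from any filing.
example = IncomeSnapshot(
    revenue=1000.0,
    cost_of_sales=600.0,
    net_income=120.0,
    weighted_avg_shares=50.0,
)
print(f"Gross margin: {gross_margin(example):.1%}")      # Gross margin: 40.0%
print(f"Basic EPS: {earnings_per_share(example):.2f}")   # Basic EPS: 2.40
```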

Methodology

1. Data Collection & Preparation

Source Documents:

Amazon’s SEC 10-K and 10-Q filings and quarterly earnings call transcripts were downloaded from public sources (such as SEC EDGAR) and placed in a designated folder as PDF files.

Document Loading:
The system uses PyPDFLoader from LangChain to extract text content and basic metadata from each PDF.
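
A minimal sketch of this loading step might look like the following. The helper names (list_pdf_paths, load_pages, load_folder) are hypothetical, and the import path langchain_community.document_loaders is an assumption that depends on the installed LangChain version. The loader is imported lazily, so the file discovery helper runs even where LangChain is not installed.

```python
from pathlib import Path


def list_pdf_paths(folder: str) -> list[Path]:
    """Return every PDF under `folder`, sorted for a stable load order."""
    return sorted(
        p for p in Path(folder).rglob("*")
        if p.is_file() and p.suffix.lower() == ".pdf"
    )


def load_pages(pdf_path: Path):
    """Load one PDF into per-page LangChain documents.

    Assumption: PyPDFLoader lives in langchain_community (recent releases)
    and the pypdf package is installed. Imported here, not at module level,
    so the discovery helper above has no third-party dependency.
    """
    from langchain_community.document_loaders import PyPDFLoader

    return PyPDFLoader(str(pdf_path)).load()


def load_folder(folder: str):
    """Load all PDFs in a folder into one flat list of page documents."""
    pages = []
    for path in list_pdf_paths(folder):
        pages.extend(load_pages(path))
    return pages
```

Each page document returned by the loader carries the extracted text plus metadata such as the source file path and page number, which is what later makes page-level citations possible.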

2. Text Chunking & Embedding

Chunking:
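Each document is split into overlapping text chunks (e.g., 1000 characters with a 200-character overlap) using RecursiveCharacterTextSplitter. Chunks of this size keep each embedding focused on a single passage, and the overlap preserves context that would otherwise be cut at chunk boundaries, which supports semantic search and retrieval even for large documents.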
Embeddings:
Each text chunk is transformed into a 384-dimensional embedding vector using the sentence-transformers/all-MiniLM-L6-v2 model via Hugging Face and LangChain.
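
To make the overlap idea concrete, here is a simplified fixed-window splitter. It is a stand-in and not RecursiveCharacterTextSplitter itself, which additionally tries to break on natural separators such as paragraph and line breaks before falling back to raw character counts. The function name split_with_overlap is hypothetical.

```python
def split_with_overlap(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into fixed-size character windows that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


sample = "".join(str(i % 10) for i in range(2500))
pieces = split_with_overlap(sample)
print(len(pieces))                           # 3
print(pieces[0][-200:] == pieces[1][:200])   # True
```

With the example settings, a 2500-character text yields windows starting at characters 0, 800, and 1600, so the last 200 characters of each chunk reappear at the start of the next one.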

3. Vector Index Construction

Chroma Vector Store:
All embeddings, along with their text chunks and metadata (such as source document and page), are stored in a persistent, file-based Chroma index (data/chroma_index/). This enables fast, similarity-based retrieval at scale, all on local disk.
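
Conceptually, a vector store keeps each embedding next to its text and metadata and ranks the stored rows by similarity to a query vector. The toy in-memory index below illustrates that idea only; it is not Chroma's API and has no persistence. The class name TinyVectorIndex, the use of cosine similarity as the ranking score, and the three-dimensional example vectors are all assumptions made for this sketch.

```python
import math


def cosine_similarity(a, b) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class TinyVectorIndex:
    """Hypothetical in-memory stand-in for a vector store."""

    def __init__(self):
        self._rows = []  # each row: (vector, chunk text, metadata dict)

    def add(self, vector, text, metadata):
        self._rows.append((list(vector), text, dict(metadata)))

    def search(self, query_vector, k=4):
        """Return the k most similar rows as (score, text, metadata), best first."""
        scored = [
            (cosine_similarity(query_vector, vec), text, meta)
            for vec, text, meta in self._rows
        ]
        scored.sort(key=lambda row: row[0], reverse=True)
        return scored[:k]


# Toy vectors and file names, invented for illustration.
index = TinyVectorIndex()
index.add([1.0, 0.0, 0.0], "chunk about cloud revenue", {"source": "example_10k.pdf", "page": 3})
index.add([0.0, 1.0, 0.0], "chunk about risk factors", {"source": "example_10k.pdf", "page": 9})
index.add([0.9, 0.1, 0.0], "chunk about cloud growth", {"source": "example_call.pdf", "page": 2})

for score, text, meta in index.search([1.0, 0.0, 0.0], k=2):
    print(f"{score:.3f}  {text}  ({meta['source']}, page {meta['page']})")
```

A real vector database adds on-disk persistence and indexing structures so that search stays fast without scanning every row, which this linear scan does not attempt.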

4. Retrieval-Augmented Generation Pipeline

User Query:
A financial analyst enters a natural language question via a Streamlit web app.
Semantic Search:
The system embeds the query and retrieves the most relevant document chunks from the Chroma index based on vector similarity.
LLM Answer Generation:
The retrieved context is provided to the Llama 3 large language model (hosted via Groq API) along with the user’s question.
The LLM is prompted to generate a concise, fact-based answer, strictly grounded in the provided filings context. If the answer cannot be found, the model is instructed to reply with "Not found in filings."
Citation & Explainability:
The answer is displayed alongside the source document snippets used for grounding, providing auditability and transparency.
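
The grounding and citation steps above can be sketched as a prompt builder that numbers each retrieved chunk with its source and page, then tells the model to answer only from that context. The exact wording of the instructions, the function name build_grounded_prompt, and the chunk dictionary layout are assumptions for illustration; only the fallback reply "Not found in filings." is taken from the description above.

```python
NOT_FOUND = "Not found in filings."


def build_grounded_prompt(question: str, chunks: list[dict]) -> str:
    """Combine retrieved chunks and the user question into one grounded prompt.

    Each chunk is expected to be a dict with 'source', 'page', and 'text' keys
    (a hypothetical layout chosen for this sketch).
    """
    context_blocks = []
    for i, chunk in enumerate(chunks, start=1):
        # The numbered label lets the answer be traced back to a source snippet.
        label = f"[{i}] {chunk['source']}, page {chunk['page']}"
        context_blocks.append(f"{label}\n{chunk['text']}")
    context = "\n\n".join(context_blocks) if context_blocks else "(no context retrieved)"

    return (
        "Answer the question using only the filings context below. "
        "Be concise and factual. "
        f'If the answer is not in the context, reply exactly: "{NOT_FOUND}"\n\n'
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


retrieved = [
    {"source": "example_10k.pdf", "page": 1, "text": "Example passage from an annual report."},
    {"source": "example_call.pdf", "page": 4, "text": "Example passage from an earnings call."},
]
print(build_grounded_prompt("What does the example report say?", retrieved))
```

Because the same numbered chunks are available to the interface, the app can show them next to the generated answer as the supporting snippets.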

5. User Interface

Streamlit App:
The entire workflow is presented through a user-friendly Streamlit web interface, allowing real-time Q&A, exploration of source context, and easy addition of new documents.


Results

System Performance

Accuracy:
The assistant reliably retrieves and answers factual questions about Amazon’s business segments, revenues, R&D, risk disclosures, management commentary, and earnings call discussions, provided these details are present in the indexed filings.
Speed:
Most queries are answered within 2 to 3 seconds, even with hundreds of PDF pages indexed, thanks to the efficiency of Chroma's dense vector search.
Transparency:
Each answer is linked to its original source snippet, enabling users to quickly verify facts and context.

Sample User Outcomes

Productivity:
Financial analysts can go from question to answer in seconds, eliminating the need to search and read through lengthy reports manually.
Insight Discovery:
Users can instantly surface critical information—such as risk factors, revenue breakdowns, management’s forward-looking statements, or segment performance—that would typically require hours to locate.
Auditability:
All insights are explainable, as the assistant shows which filings and sections were used to generate each answer.


Demonstrated Example Interactions

Question: What was Amazon's total revenue in 2023?
System Output: "$574.8 billion (as per 2023 10-K, page 1)" + [source chunk from filing]

Question: What are the main risk factors disclosed in the latest 10-K?
System Output: "Amazon cited risks such as cybersecurity, global competition, and supply chains…"

Question: Summarize AWS growth as discussed in the last earnings call transcript.
System Output: "Management reported double-digit AWS growth driven by cloud adoption and AI…"

User Feedback

A 10x speedup in regulatory research tasks
Improved confidence in Q&A accuracy due to direct citations
Seamless extension to other companies or custom datasets by simply adding new PDFs

This assistant brings LLM-powered, explainable AI search directly to complex, high-stakes financial documents—offering a leap forward for analysts, researchers, and business leaders.

Acknowledgement

This project is part of the Agentic AI Developer Certification program by Ready Tensor. We thank the Ready Tensor developer community for its guidance and contributions.