Finance Analyst AI is a Retrieval-Augmented Generation (RAG) chatbot that enables natural language question-answering over financial documents. Users upload a PDF annual report and the system automatically chunks, embeds, and stores the content in a persistent vector database. At query time, relevant document segments are retrieved and passed to a large language model to generate grounded, document-faithful responses. The system is built with LangChain, ChromaDB, Groq's llama-3.1-8b-instant for generation, and HuggingFace all-MiniLM-L6-v2 for embeddings, served through a two-tab Streamlit interface. It incorporates conversational memory to support multi-turn dialogue and is designed to refuse answers that cannot be supported by the uploaded document.
The system follows a six-stage pipeline. A PDF annual report is uploaded through the Streamlit UI and loaded using LangChain's PyPDFLoader. The extracted text is split into 800-token chunks with a 100-token overlap using RecursiveCharacterTextSplitter to preserve context across boundaries. Each chunk is embedded using HuggingFace's all-MiniLM-L6-v2 model and stored in a ChromaDB persistent vector store on disk.
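The chunking step above can be sketched in plain Python. This is a minimal, character-based stand-in for LangChain's RecursiveCharacterTextSplitter (which by default also measures chunk size in characters, not tokens, and additionally tries to split on natural boundaries like paragraphs); the function name and the fixed-window strategy are illustrative assumptions, not the project's actual code.

```python
def chunk_text(text: str, chunk_size: int = 800, overlap: int = 100) -> list[str]:
    """Split text into fixed-size windows that overlap by `overlap`
    characters, mirroring the 800/100 splitter settings described above.
    The overlap repeats the tail of each chunk at the head of the next,
    which is what preserves context across chunk boundaries."""
    step = chunk_size - overlap  # advance by chunk_size minus the shared overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]
```

Each chunk produced this way would then be embedded and written to the vector store; the overlap means a sentence straddling a boundary appears intact in at least one chunk.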
At inference time, the user's query is embedded using the same model and the top-5 most semantically similar chunks are retrieved. These are injected into a structured prompt defined in YAML, which sets the model's role, constraints, tone, and output format, and explicitly instructs the model to answer only from the retrieved context. Groq's llama-3.1-8b-instant was selected as the generation model for its low-latency inference. Conversational memory is maintained using LangChain's ConversationSummaryMemory, which compresses prior turns into a rolling summary appended to each new prompt.
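The retrieval step amounts to ranking stored chunk embeddings by similarity to the query embedding and keeping the top five. Below is a dependency-free sketch of that ranking using cosine similarity (the metric ChromaDB commonly uses); the function and its calling convention are illustrative assumptions, since in the real system the vector store performs this search internally.

```python
import math

def top_k_chunks(query_vec: list[float], chunk_vecs: list[list[float]], k: int = 5) -> list[int]:
    """Return the indices of the k chunk embeddings most similar to the
    query embedding, ranked by cosine similarity (highest first)."""
    def cos(a: list[float], b: list[float]) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb) if na and nb else 0.0

    ranked = sorted(range(len(chunk_vecs)),
                    key=lambda i: cos(query_vec, chunk_vecs[i]),
                    reverse=True)
    return ranked[:k]
```

The chunks at the returned indices are what get injected into the YAML-defined prompt, alongside the rolling conversation summary, before the call to the generation model.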
The chatbot was manually tested against company annual reports and successfully answered a range of financial questions, including revenue figures, expense breakdowns, and risk factors. Responses were returned in markdown format with bullet points, consistent with the prompt configuration. The persistent vector store correctly survived application restarts, and conversational memory maintained coherent context across multi-turn exchanges. When a question fell outside the scope of the uploaded document, the system returned the configured fallback message rather than hallucinating an answer.
No formal quantitative evaluation was conducted; assessment was based entirely on manual testing. The primary observed limitation is that answer quality is bounded by PDF text extraction quality: scanned or image-based annual reports without embedded text will yield degraded results.
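A cheap guard against the extraction limitation noted above is to check how much text the PDF loader actually recovered before indexing. The sketch below is a hypothetical heuristic, not part of the project; the function name and the 200-character threshold are assumptions chosen for illustration.

```python
def has_extractable_text(page_texts: list[str], min_chars: int = 200) -> bool:
    """Heuristic pre-flight check: scanned or image-only PDFs typically
    yield empty or near-empty page text. `page_texts` is the list of
    per-page strings returned by a PDF loader; if the total stripped
    length falls below the threshold, the upload likely needs OCR."""
    return sum(len(p.strip()) for p in page_texts) >= min_chars
```

If this check fails, the UI could warn the user to upload a text-based PDF (or run OCR first) instead of silently building an empty index.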