Finops Conversational agent

Abstract

This work presents a FinOps Conversational Agentic AI System designed to simplify cloud cost management through natural-language interaction. The system combines Retrieval-Augmented Generation (RAG), multi-agent reasoning, and FinOps domain knowledge to deliver insights on cost allocation, optimization, forecasting, and governance. By integrating vector search, LLM reasoning, and automated decision pathways, the system provides an efficient and accessible way for engineering and finance teams to understand and manage cloud expenditure.

Introduction

Cloud spending is increasing across organizations, creating a need for automated systems that provide real-time insights, optimization recommendations, and proactive cost governance. This publication presents a FinOps-focused conversational agentic AI system that enables users to interact with cloud financial data using natural language.
The system integrates Retrieval-Augmented Generation (RAG), multi-agent orchestration, and role-specific workflows to deliver FinOps intelligence. The pipeline includes:

Document ingestion and domain knowledge extraction
Embedding generation and vector-based retrieval
FinOps-specific reasoning agents
LLM-driven conversation layer
Cost analysis, forecasting, and optimization modules

Key Features

RAG pipeline with vector search and semantic reranking
Automatic ingestion of .txt documents from the data folder
Persistent ChromaDB vector database
High-quality HuggingFace MiniLM embeddings
Token-based and paragraph-based text chunking
LLM fallback chain: OpenAI → Groq → Retrieval-only
Clean architecture with separate modules for vector DB and app logic
Environment-driven configuration through .env

Installation

git clone https://github.com/Suchi-BITS/ReadyTensor_Project-1.git
cd <project_folder>
python -m venv venv
venv\Scripts\activate # Windows
python -m pip install --upgrade pip
pip install -r requirements.txt

Project Structure

project/
│
├── data/ # Input documents (.txt files)
│ └── *.txt
│
├── chroma_store/ # Auto-persisted ChromaDB files
│
├── src/
│ ├── app.py # Main RAG application
│ └── vectordb.py # Vector DB wrapper (Chroma + HF + reranker)
│
├── .env # Configuration keys and model selection
└── README.md

Run the app

python src/app.py

How it works

Loads all .txt documents from the data folder
Splits them into optimized chunks
Generates embeddings using HuggingFace
Stores vectors in ChromaDB
Retrieves relevant chunks for a query
Reranks them using CrossEncoder
Constructs final context
Generates an answer using available LLMs

Resetting the Vector Store

To clear Chroma:

from vectordb import VectorDB
v = VectorDB("rag_documents", "sentence-transformers/all-MiniLM-L6-v2")
v.clear_collection()

Or delete the chroma_store folder manually.

Notes

Only .txt files are supported for ingestion.
Restart the application after adding new documents.
All components run on CPU by default.
You can switch to larger embedding models if required.

Architecture Overview

Document Ingestion
Loads all .txt files from the data directory.
Splits documents into meaningful chunks:
Paragraph-based chunking
Token-based chunking using RecursiveCharacterTextSplitter
Embedding Layer
Generates normalized embeddings using HuggingFace MiniLM.
Implemented via langchain-huggingface.
Vector Store
Stores embeddings using Chroma from langchain-chroma.
Data persists automatically inside the chroma_store directory.
Reranking Layer
Uses a CrossEncoder reranker (ms-marco-MiniLM-L-6-v2) to reorder retrieved documents and improve answer accuracy.
LLM Response Layer
Priority order:
OpenAI (gpt-4o-mini)
Groq (llama-3.1-8b-instant)
Retrieval-only fallback if no API keys are available
Console Interface
Simple REPL-style question-and-answer interface.
Displays answer, retrieved context, and document sources.

Conclusion

The FinOps Conversational Agentic AI System provides a practical approach to simplifying cloud financial operations. By integrating RAG, vector retrieval, multi-agent reasoning, and LLM-powered conversation, the system makes complex cost analysis accessible and actionable. This work demonstrates that AI-driven FinOps assistants can enhance decision-making, reduce operational effort, and support effective cost governance in cloud-first organizations.