This work presents a FinOps Conversational Agentic AI System designed to simplify cloud cost management through natural-language interaction. The system combines Retrieval-Augmented Generation (RAG), multi-agent reasoning, and FinOps domain knowledge to deliver insights on cost allocation, optimization, forecasting, and governance. By integrating vector search, LLM reasoning, and automated decision pathways, the system provides an efficient and accessible way for engineering and finance teams to understand and manage cloud expenditure.
Cloud spending is increasing across organizations, creating a need for automated systems that provide real-time insights, optimization recommendations, and proactive cost governance. This publication presents a FinOps-focused conversational agentic AI system that enables users to interact with cloud financial data using natural language.
The system integrates Retrieval-Augmented Generation (RAG), multi-agent orchestration, and role-specific workflows to deliver FinOps intelligence. The pipeline includes:
Document ingestion and domain knowledge extraction
Embedding generation and vector-based retrieval
FinOps-specific reasoning agents
LLM-driven conversation layer
Cost analysis, forecasting, and optimization modules
Key Features
Natural-language interface to cloud cost and FinOps data
RAG pipeline with vector retrieval and CrossEncoder reranking
Multi-agent reasoning for cost allocation, optimization, forecasting, and governance
LLM provider fallback (OpenAI, then Groq, then retrieval-only)
Persistent ChromaDB vector store
Installation
git clone https://github.com/Suchi-BITS/ReadyTensor_Project-1.git
cd ReadyTensor_Project-1
python -m venv venv
venv\Scripts\activate # Windows
source venv/bin/activate # macOS/Linux
python -m pip install --upgrade pip
pip install -r requirements.txt
Project Structure
project/
├── data/              # Input documents (.txt files)
│   └── *.txt
├── chroma_store/      # Auto-persisted ChromaDB files
├── src/
│   ├── app.py         # Main RAG application
│   └── vectordb.py    # Vector DB wrapper (Chroma + HF + reranker)
├── .env               # Configuration keys and model selection
└── README.md
Run the app
python src/app.py
How it works
To clear Chroma:
from vectordb import VectorDB
v = VectorDB("rag_documents", "sentence-transformers/all-MiniLM-L6-v2")
v.clear_collection()
Or delete the chroma_store folder manually.
Document Ingestion
Loads all .txt files from the data directory.
Splits documents into meaningful chunks:
Paragraph-based chunking
Token-based chunking using RecursiveCharacterTextSplitter
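The paragraph-based step can be sketched in plain Python; the function name and size budget below are illustrative, not the project's actual implementation (the token-based pass uses LangChain's RecursiveCharacterTextSplitter, which is not reproduced here):

```python
# Hypothetical sketch of paragraph-based chunking: split on blank lines,
# then merge paragraphs until a character budget is reached.
def chunk_paragraphs(text: str, max_chars: int = 500) -> list[str]:
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current = ""
    for para in paragraphs:
        # Start a new chunk if adding this paragraph would exceed the budget.
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks
```

Keeping paragraph boundaries intact tends to produce more coherent retrieval units than cutting at a fixed character offset.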
Embedding Layer
Generates normalized embeddings using the HuggingFace sentence-transformers/all-MiniLM-L6-v2 model.
Implemented via langchain-huggingface.
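"Normalized" here means each embedding is scaled to unit L2 norm, so a plain dot product between two vectors equals their cosine similarity. A minimal illustration of that property (not the langchain-huggingface call itself):

```python
import math

def l2_normalize(vec: list[float]) -> list[float]:
    """Scale a vector to unit L2 norm, which is what normalized embeddings means."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec

def dot(a: list[float], b: list[float]) -> float:
    # For unit-norm vectors, this dot product is the cosine similarity.
    return sum(x * y for x, y in zip(a, b))
```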
Vector Store
Stores embeddings using Chroma from langchain-chroma.
Data persists automatically inside the chroma_store directory.
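Conceptually, the store maps each chunk to its embedding and returns the nearest chunks to a query embedding. The toy class below sketches that idea in memory only; it is not the Chroma API, and the real project persists to disk via langchain-chroma:

```python
# In-memory sketch of what a vector store does: keep (text, embedding) pairs
# and return the top-k texts by similarity to a query embedding.
class TinyVectorStore:
    def __init__(self) -> None:
        self._docs: list[tuple[str, list[float]]] = []

    def add(self, text: str, embedding: list[float]) -> None:
        self._docs.append((text, embedding))

    def search(self, query_emb: list[float], k: int = 2) -> list[str]:
        # Dot product acts as cosine similarity when embeddings are unit-norm.
        def score(item: tuple[str, list[float]]) -> float:
            _, emb = item
            return sum(q * e for q, e in zip(query_emb, emb))

        ranked = sorted(self._docs, key=score, reverse=True)
        return [text for text, _ in ranked[:k]]
```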
Reranking Layer
Uses a CrossEncoder reranker (ms-marco-MiniLM-L-6-v2) to reorder retrieved documents and improve answer accuracy.
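The reranking step scores each (query, document) pair and reorders the retrieved set by score. To keep this sketch runnable without downloading a model, the scorer below is a stand-in for `CrossEncoder.predict`, faked with simple term overlap; only the reorder logic mirrors what the pipeline does:

```python
# Stand-in for CrossEncoder.predict: one relevance score per (query, doc) pair.
# Real scoring uses the ms-marco-MiniLM-L-6-v2 cross-encoder.
def fake_cross_encoder_scores(pairs: list[tuple[str, str]]) -> list[float]:
    return [
        float(len(set(q.lower().split()) & set(d.lower().split())))
        for q, d in pairs
    ]

def rerank(query: str, docs: list[str], top_k: int = 3) -> list[str]:
    scores = fake_cross_encoder_scores([(query, d) for d in docs])
    ranked = sorted(zip(docs, scores), key=lambda pair: pair[1], reverse=True)
    return [doc for doc, _ in ranked[:top_k]]
```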
LLM Response Layer
Priority order:
OpenAI (gpt-4o-mini)
Groq (llama-3.1-8b-instant)
Retrieval-only fallback if no API keys are available
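The priority order above amounts to a simple key-based fallback. The env var names in this sketch are assumptions; check the project's .env for the actual keys:

```python
# Sketch of the provider priority: OpenAI first, then Groq, else retrieval-only.
# OPENAI_API_KEY / GROQ_API_KEY are assumed names, not confirmed by the repo.
def pick_provider(env: dict[str, str]) -> str:
    if env.get("OPENAI_API_KEY"):
        return "openai:gpt-4o-mini"
    if env.get("GROQ_API_KEY"):
        return "groq:llama-3.1-8b-instant"
    return "retrieval-only"
```

Passing the environment as a dict (rather than reading `os.environ` directly) keeps the selection logic easy to test.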
Console Interface
Simple REPL-style question-and-answer interface.
Displays answer, retrieved context, and document sources.
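One turn of that output might be rendered as below; the function and field layout are hypothetical, standing in for whatever formatting app.py actually does:

```python
# Hypothetical rendering of one Q&A turn: answer, retrieved context, sources.
def render_turn(answer: str, contexts: list[str], sources: list[str]) -> str:
    lines = [f"Answer: {answer}", "", "Context:"]
    lines += [f"  - {c}" for c in contexts]
    lines += ["", "Sources: " + ", ".join(sources)]
    return "\n".join(lines)
```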
The FinOps Conversational Agentic AI System provides a practical approach to simplifying cloud financial operations. By integrating RAG, vector retrieval, multi-agent reasoning, and LLM-powered conversation, the system makes complex cost analysis accessible and actionable. This work demonstrates that AI-driven FinOps assistants can enhance decision-making, reduce operational effort, and support effective cost governance in cloud-first organizations.