The RAG-Based AI Banking Assistant is a fully implemented Retrieval-Augmented Generation system designed to provide accurate, document-aware responses using your own dataset. It integrates ChromaDB, HuggingFace embeddings, and multiple Large Language Model (LLM) providers to deliver an intelligent, consistent question-answering experience.
This system essentially functions as:

"A ChatGPT-like assistant that understands and answers using your documents."
The project enables querying information stored in local .txt documents through a complete RAG pipeline. It automatically selects an available LLM provider from OpenAI, Groq, or Google Gemini, so the pipeline runs end to end without any configuration changes in the code. This AI Banking Assistant is designed to give accurate, document-grounded answers to banking questions using your own dataset.
The complete system operates through the following stages:
1. Document loading: reads .txt files from the data/ directory.
2. Chunking: splits large text inputs into structured chunks using LangChain's RecursiveCharacterTextSplitter, improving retrieval accuracy.
3. Embedding: converts text chunks into vector embeddings using all-MiniLM-L6-v2 from HuggingFace SentenceTransformers.
4. Vector storage: stores embeddings inside ChromaDB with persistent local storage under chroma_db/.
5. Retrieval: retrieves the top-k most relevant document chunks from the vector database.
6. Generation: a custom RAG prompt combines the retrieved context with the user question and forwards it to the selected LLM (see the sketch below).
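The sketch below illustrates these six stages with LangChain-style components (RecursiveCharacterTextSplitter, HuggingFace embeddings, and the Chroma vector store). It is only an approximation of the real code in src/vectordb.py and src/app.py; the chunk sizes, prompt wording, and variable names are assumptions.

```python
# Illustrative RAG pipeline sketch -- the actual logic lives in src/vectordb.py and src/app.py.
from pathlib import Path

from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma

# Stages 1-2: load every .txt file from data/ and split it into overlapping chunks.
raw_texts = [p.read_text(encoding="utf-8") for p in Path("data").glob("*.txt")]
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = [chunk for text in raw_texts for chunk in splitter.split_text(text)]

# Stages 3-4: embed the chunks with all-MiniLM-L6-v2 and persist them in ChromaDB.
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
vectordb = Chroma.from_texts(chunks, embedding=embeddings, persist_directory="chroma_db")

# Stage 5: retrieve the top-k most relevant chunks for a question.
question = "What is KYC in banking?"
top_docs = vectordb.similarity_search(question, k=3)
context = "\n\n".join(doc.page_content for doc in top_docs)

# Stage 6: build the RAG prompt (retrieved context + user question) for the selected LLM.
rag_prompt = (
    "Answer the question using only the context below.\n\n"
    f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
)
# answer = llm.invoke(rag_prompt)  # llm = whichever provider client was selected (OpenAI / Groq / Gemini)
```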
LLM selection priority: OpenAI → Groq → Google Gemini (the first provider with a configured API key is used).
✅ Fully functional RAG pipeline
✅ Supports multiple LLM providers
✅ Persistent vector storage with ChromaDB
✅ Automated document chunking
✅ HuggingFace-based embeddings
✅ Structured and modular codebase
✅ Zero manual configuration inside code
✅ Works with any .txt dataset
```
AAIDC-Project-01-AI-Banking-Assistant/
├── src/
│   ├── app.py              # Main RAG engine
│   └── vectordb.py         # Chroma + Embeddings + Chunking + Search
├── data/
│   ├── banking_data01.txt
│   └── banking_data02.txt
├── chroma_db/              # Auto-created persistent DB
├── requirements.txt
├── .env.example
└── README.md
```
1️⃣ Install Dependencies
```
pip install -r requirements.txt
```
2️⃣ Configure Your API Key
Add your API key to .env:
```
OPENAI_API_KEY=your_key_here
# OR
GROQ_API_KEY=your_key_here
# OR
GOOGLE_API_KEY=your_key_here
```
The system will automatically select the first available one.
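Conceptually, the fallback check behaves like the sketch below. This is an illustration only: the function name and error message are assumptions, and the actual selection logic lives in src/app.py.

```python
import os

def select_llm_provider() -> str:
    """Pick the first provider whose API key is set in the environment / .env file."""
    if os.getenv("OPENAI_API_KEY"):
        return "openai"
    if os.getenv("GROQ_API_KEY"):
        return "groq"
    if os.getenv("GOOGLE_API_KEY"):
        return "gemini"
    raise RuntimeError("No LLM API key found. Add one of the keys above to your .env file.")
```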
3️⃣ Add Your Documents
Place .txt files into the data/ folder.
Example:
```
data/
├── banking_data01.txt
└── banking_data02.txt
```
4️⃣ Run the RAG Assistant
```
python src/app.py
```
Example interaction:
```
Your question: What is KYC in banking?

ANSWER: KYC (Know Your Customer) is a process...
```
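The question-and-answer loop behind this interaction can be pictured with the hypothetical sketch below. It is not the actual contents of src/app.py: ask_llm is a stand-in for the selected provider's client, and VectorDB.search is assumed to return the matching text chunks.

```python
# Hypothetical interactive loop -- see src/app.py for the real implementation.
from src.vectordb import VectorDB

def ask_llm(prompt: str) -> str:
    """Stand-in for the selected provider's chat-completion call (OpenAI / Groq / Gemini)."""
    raise NotImplementedError

def main() -> None:
    db = VectorDB()
    while True:
        question = input("Your question: ").strip()
        if not question or question.lower() in {"exit", "quit"}:
            break
        chunks = db.search(question)                      # top-k relevant chunks
        context = "\n\n".join(str(c) for c in chunks)
        prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
        print("ANSWER:", ask_llm(prompt))

if __name__ == "__main__":
    main()
```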
Test Chunking
```python
from src.vectordb import VectorDB

v = VectorDB()
print(v.chunk_text("Sample document test"))
```
Test Vector Search
v.search("banking")
Test Full RAG
Run:
```
python src/app.py
```
Ask:
Explain the NEFT payment system.
🛠️ Tech Stack
- Python
- LangChain (RecursiveCharacterTextSplitter for chunking)
- HuggingFace SentenceTransformers (all-MiniLM-L6-v2 embeddings)
- ChromaDB (persistent vector storage)
- LLM providers: OpenAI, Groq, Google Gemini
This RAG-based AI system demonstrates how retrieval and generation can be effectively combined to build a domain-specific, intelligent banking assistant. Its modular design, multi-provider LLM support, and persistent vector storage make it suitable for real-world applications and future enhancements.
Wahid Jamadar
B.Tech CSE
DY Patil Agriculture & Technical University, Kolhapur