This publication presents a Retrieval-Augmented Generation (RAG) system for querying diabetes research documents, with a focus on traditional Indian medical methodologies. The system supports semantic search and natural language question answering over a curated set of PDFs, providing insights into Ayurvedic and other indigenous approaches to diabetes management.
Data Collection
Gathered diabetes research PDFs related to traditional Indian medicine.
Extracted the text using PDF parsers and cleaned it.
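The cleaning step is not described in detail, so the following is only a minimal sketch of what it might involve, assuming the parser returns one raw string per page. The helper name `clean_extracted_text` and the specific rules (ligature normalization, de-hyphenation, whitespace collapsing) are illustrative assumptions, not the system's actual rules.

```python
import re
import unicodedata


def clean_extracted_text(raw: str) -> str:
    """Normalize text pulled out of a PDF page (hypothetical helper)."""
    # NFKC folds compatibility forms such as the "ﬁ" ligature into "fi".
    # Assumption: this is acceptable for the corpus; it also flattens
    # characters like superscript digits, which may not always be wanted.
    text = unicodedata.normalize("NFKC", raw)
    # Rejoin words hyphenated across a line break: "manage-\nment".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Keep paragraph breaks (blank lines) but unwrap single newlines.
    paragraphs = re.split(r"\n\s*\n", text)
    cleaned = [re.sub(r"\s+", " ", p).strip() for p in paragraphs]
    return "\n\n".join(p for p in cleaned if p)


if __name__ == "__main__":
    page = "Ayurvedic manage-\nment of  diabetes\nuses herbs.\n\n\nSecond para."
    print(clean_extracted_text(page))
```

Paragraph breaks are kept because a later chunking step can use them as natural boundaries.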
Chunking Strategy
Applied semantic chunking to split the extracted text along shifts in meaning rather than at fixed lengths.
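One common way to do semantic chunking is to embed consecutive sentences and start a new chunk wherever the similarity between neighbors drops below a threshold. The sketch below illustrates that idea only. The `toy_embed` function is a bag-of-words stand-in so the example runs without external models, and the threshold value of 0.2 is an arbitrary assumption; a real pipeline would pass in its sentence-embedding model instead.

```python
import math
import re
from collections import Counter
from typing import Callable, Dict, List

Vector = Dict[str, float]


def toy_embed(sentence: str) -> Vector:
    """Stand-in embedding: word counts (NOT a real sentence encoder)."""
    return dict(Counter(re.findall(r"\w+", sentence.lower())))


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_chunks(
    text: str,
    embed: Callable[[str], Vector] = toy_embed,
    threshold: float = 0.2,  # assumed value; tune on real data
) -> List[str]:
    """Group adjacent sentences, breaking where similarity falls below threshold."""
    # Split after sentence-ending punctuation, including the Devanagari danda.
    sentences = [s for s in re.split(r"(?<=[.!?।])\s+", text.strip()) if s]
    if not sentences:
        return []
    chunks: List[List[str]] = [[sentences[0]]]
    prev_vec = embed(sentences[0])
    for sentence in sentences[1:]:
        vec = embed(sentence)
        if cosine(prev_vec, vec) < threshold:
            chunks.append([sentence])    # topic shift: open a new chunk
        else:
            chunks[-1].append(sentence)  # same topic: extend current chunk
        prev_vec = vec
    return [" ".join(chunk) for chunk in chunks]


if __name__ == "__main__":
    sample = (
        "Bitter gourd lowers blood glucose. "
        "Bitter gourd juice is taken daily. "
        "Yoga improves insulin sensitivity."
    )
    for chunk in semantic_chunks(sample):
        print("-", chunk)
```

Comparing each sentence only to its immediate predecessor keeps the sketch short. Comparing against a running average of the current chunk is a frequent variation.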
Embedding Model
Used a multilingual sentence embedding model to encode each chunk.
Vector Store
Indexed the chunk embeddings for efficient similarity search.
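The vector store used is not named, so the sketch below is a brute-force, in-memory stand-in that shows what similarity search over indexed chunks amounts to: store one normalized vector per chunk, then rank chunks by cosine similarity to the query vector. The class name `InMemoryVectorIndex` and its methods are hypothetical. A production store would typically use an approximate nearest-neighbor index rather than a full scan.

```python
import math
from typing import List, Sequence, Tuple


class InMemoryVectorIndex:
    """Brute-force cosine-similarity index (illustrative, not a real store)."""

    def __init__(self) -> None:
        self._vectors: List[List[float]] = []
        self._chunks: List[str] = []

    def _normalize(self, vector: Sequence[float]) -> List[float]:
        if self._vectors and len(vector) != len(self._vectors[0]):
            raise ValueError("all vectors must share one dimensionality")
        norm = math.sqrt(sum(x * x for x in vector))
        if norm == 0:
            raise ValueError("cannot index or query a zero vector")
        return [x / norm for x in vector]

    def add(self, chunk: str, vector: Sequence[float]) -> None:
        """Store a chunk with its unit-length embedding."""
        self._vectors.append(self._normalize(vector))
        self._chunks.append(chunk)

    def search(self, query_vector: Sequence[float], k: int = 3) -> List[Tuple[float, str]]:
        """Return the k (score, chunk) pairs with highest cosine similarity."""
        q = self._normalize(query_vector)
        # Vectors are unit length, so the dot product equals the cosine.
        scored = [
            (sum(a * b for a, b in zip(q, v)), chunk)
            for v, chunk in zip(self._vectors, self._chunks)
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


if __name__ == "__main__":
    index = InMemoryVectorIndex()
    index.add("herbal", [1.0, 0.0])
    index.add("yoga", [0.0, 1.0])
    index.add("diet", [1.0, 1.0])
    for score, chunk in index.search([2.0, 0.1], k=2):
        print(f"{score:.3f}  {chunk}")
```

Normalizing at insert time means each query costs one dot product per stored chunk, which is adequate for a small curated corpus.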
Query Flow
User query → Embedding → Similarity search → Context retrieved → LLM generates answer.
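The flow above can be sketched as a single function that wires the stages together. The embedding model, vector search, and LLM are passed in as plain callables because the concrete components are not specified here. The names `answer_query` and `build_prompt`, the prompt wording, and the default `k=3` are all illustrative assumptions.

```python
from typing import Callable, List, Sequence


def build_prompt(question: str, contexts: Sequence[str]) -> str:
    """Combine retrieved chunks and the question into one LLM prompt."""
    numbered = "\n".join(f"[{i}] {c}" for i, c in enumerate(contexts, start=1))
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{numbered}\n\nQuestion: {question}\nAnswer:"
    )


def answer_query(
    question: str,
    embed: Callable[[str], Sequence[float]],                  # query -> vector
    search: Callable[[Sequence[float], int], List[str]],      # vector, k -> chunks
    generate: Callable[[str], str],                           # prompt -> answer
    k: int = 3,  # assumed number of chunks to retrieve
) -> str:
    """User query -> embedding -> similarity search -> context -> LLM answer."""
    query_vector = embed(question)
    contexts = search(query_vector, k)
    prompt = build_prompt(question, contexts)
    return generate(prompt)


if __name__ == "__main__":
    # Stub components so the sketch runs without any model or store.
    def fake_embed(text: str) -> List[float]:
        return [float(len(text))]

    def fake_search(vector: Sequence[float], k: int) -> List[str]:
        return ["Chunk about herbal remedies.", "Chunk about diet."][:k]

    def fake_generate(prompt: str) -> str:
        return f"(stub answer based on {prompt.count('[')} context chunks)"

    print(answer_query("Which herbs are discussed?", fake_embed, fake_search, fake_generate, k=2))
```

Injecting the three stages as callables keeps the flow testable with stubs and lets any one component be swapped without touching the others.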
Accuracy: The system successfully retrieved relevant content.