LLMs often return incorrect or hallucinated answers when asked about domain-specific rulebooks. This work describes a minimalist RAG system for answering questions strictly from the MCC 2017 Laws of Cricket. The approach embeds the official rulebook into a persistent vector database and, at query time, retrieves the context passages relevant to a question, ensuring that responses are grounded solely in the provided document. The project demonstrates a constrained, reproducible RAG pipeline focused on correctness, transparency, and hallucination avoidance.
General-purpose LLMs are not reliable at answering questions about structured rulebooks such as sports laws. Even when pre-trained on large corpora, models may fabricate details or rely on information that is out of date. In domains where correctness is critical, responses must be grounded in authoritative documents rather than model priors.
The aim of this project is to design a simple RAG system that answers cricket-related questions using only the official MCC 2017 Laws of Cricket, without inference or speculation beyond the source material.
The system uses only one authoritative document:
Laws of Cricket 2017 Edition, Marylebone Cricket Club (MCC)
The document is a single PDF containing the entire official rulebook. No other documents, summaries, or external data sources are used. This constraint ensures traceability and limits ambiguity in retrieved answers.
The system follows a typical Retrieval-Augmented Generation pipeline:
The MCC 2017 PDF is loaded via a PDF document loader.
The document is divided into overlapping chunks of text using a recursive character-based method to preserve contextual continuity.
Each chunk is embedded by a sentence-transformer embedding model.
The embeddings are stored in a persistent Chroma vector database. The database is created once and reused across runs.
Retrieval: at query time, a semantic similarity search returns the top-K most relevant chunks.
Generation: the retrieved chunks are passed to an LLM (served via Groq) with a strict system prompt that instructs the model to answer only from the provided context. If no relevant chunks are retrieved, the system returns a fixed fallback response indicating insufficient information.
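The indexing stage above can be sketched in a few lines of LangChain-style Python. This is an illustrative sketch only: the PDF filename, chunk sizes, embedding model, and persist directory are assumed placeholders rather than the repository's exact configuration.

# Illustrative one-off indexing sketch (assumed names and parameters, not the repo's exact code)
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma

# Load the single authoritative PDF (path is a placeholder).
docs = PyPDFLoader("laws_of_cricket_2017.pdf").load()

# Split into overlapping chunks; the sizes here are assumptions, not prescribed values.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=150)
chunks = splitter.split_documents(docs)

# Embed each chunk with a sentence-transformer model (model choice assumed).
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")

# Build the persistent Chroma store once; later runs reuse the same directory.
Chroma.from_documents(chunks, embeddings, persist_directory="chroma_db")

Because the store is written to disk, this step runs once and every later query reuses the same index.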
To prevent hallucinations, the system incorporates multiple checks:
Generation is not possible without retrieval.
The LLM prompt does not allow guessing or inference.
Answers are generated only from the retrieved context.
A fixed response is returned if there are no relevant chunks.
This design stresses correctness over completeness of answers.
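These checks can be made concrete at query time. The sketch below uses the same assumptions as the indexing sketch above, plus an assumed Groq model name, top-K value, and fallback/prompt wording; it reopens the persisted store, refuses to generate when retrieval returns nothing, and otherwise instructs the Groq-hosted LLM to answer only from the retrieved context.

# Illustrative query-time sketch with the retrieval gate and strict prompt (assumed details)
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_groq import ChatGroq  # reads GROQ_API_KEY from the environment

FALLBACK = "The provided Laws of Cricket do not contain enough information to answer this."

# Reopen the persisted store built during indexing (same directory, same embedding model).
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
vectordb = Chroma(persist_directory="chroma_db", embedding_function=embeddings)

def answer(question: str, k: int = 4) -> str:
    # Retrieval gate: no retrieved chunks means no generation at all.
    hits = vectordb.similarity_search(question, k=k)
    if not hits:
        return FALLBACK

    context = "\n\n".join(doc.page_content for doc in hits)

    # Strict system prompt: answer only from the supplied context, never guess.
    system = (
        "You answer questions using ONLY the context below, taken from the "
        "MCC 2017 Laws of Cricket. If the context does not contain the answer, "
        f'reply exactly: "{FALLBACK}"\n\nContext:\n' + context
    )

    llm = ChatGroq(model="llama-3.1-8b-instant", temperature=0)  # model name is assumed
    return llm.invoke([("system", system), ("human", question)]).content

Reopening the persisted directory rather than re-embedding the rulebook is also what gives the stable cross-run behavior noted in the evaluation below.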
The system was evaluated qualitatively through interactive queries. The observed behaviors are:
Correct answers for questions directly addressed by the MCC rules.
Consistent refusal to answer when the required information is not available.
Stable behavior across runs because embeddings are stored persistently.
The system does not summarize, interpret, or provide opinion-based reasoning.
Project Repo Link: https://github.com/m-noumanfazil/cricket-rag-cli-using-langchain
No quantitative evaluation metrics are reported.
No automated benchmarking against ground-truth answers is performed.
These limitations are intentional, keeping the system minimal while focusing on correctness.
This work presents a minimalist, bounded RAG system for rule-based question answering. The system is restricted to a single authoritative source document and permits only retrieval-grounded generation. It exemplifies how RAG can be used to avoid hallucinations in domain-specific scenarios and can serve as a reference for developing safe and transparent RAG pipelines.