This paper describes the development of a production-ready, rule-grounded conversational AI system that answers cricket law questions based solely on the official MCC 2017 Laws of Cricket. The system uses a minimal Retrieval-Augmented Generation (RAG) pipeline with persistent vector storage, semantic retrieval, and tightly constrained generation, and it prevents hallucination through mandatory retrieval, carefully controlled prompting, and fixed fallback responses. The project illustrates how grounded AI systems can deliver high reliability, transparency, and correctness in rule-bound domains without relying on external knowledge or inference.

1. Introduction
   1.1 Background and Motivation
   1.2 Problem Statement
   1.3 Objectives
   1.4 Scope
2. System Architecture
   2.1 Architectural Overview
   2.2 Component Layers
3. Methodology
   3.1 Document Ingestion
   3.2 Text Processing
   3.3 Chunking Strategy
   3.4 Embedding Pipeline
   3.5 Vector Storage
   3.6 Retrieval Process
   3.7 Generation Control
   3.8 Failure Handling
Although Large Language Models (LLMs) have revolutionized conversational AI, they are unreliable in domains that require very high levels of strict factual correctness. Rulebooks, legal texts, and regulatory literature require grounded answers, not probabilistic outputs. In sports law, for example, incorrect answers can propagate misinformation and misinterpretation of official rules.
This project describes a grounded AI system capable of answering cricket rule questions based solely on the official MCC 2017 Laws of Cricket. The system avoids inference, speculation, and external knowledge, ensuring that every answer is directly traceable to the official source.
General-purpose LLMs are prone to hallucination when answering structured, rule-based queries: they may fabricate rules, fall back on outdated training data, or infer information not explicitly stated in the official text, making them unsuitable for domains that demand strict correctness.
The core problem with current AI systems is the absence of grounded generation and source-bounded reasoning.
The system follows a modular pipeline architecture that separates data ingestion, vector storage, retrieval, and generation into independent components. This ensures clarity, reproducibility, and maintainability.
The knowledge source is a single PDF containing the entire official rulebook:
"Laws of Cricket, 2017 Edition, Marylebone Cricket Club (MCC)"
No other documents, summaries, or external data sources are used. This constraint ensures traceability and limits ambiguity in retrieved answers.
- PDF loader
- Recursive text splitter
- Sentence-transformer embedding model
- Chroma persistent vector database
- Semantic similarity search
- LLM with strict grounding prompt
- CLI application
The MCC 2017 PDF is loaded using a PDF document loader.
Extracted text is normalized for segmentation.
Recursive character-based chunking is applied with:
- Chunk size: 500 characters
- Overlap: 100 characters
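The project delegates chunking to a recursive text splitter; as a minimal sketch of the size/overlap behaviour, the pure-Python function below splits text into fixed character windows with the same parameters. (The real recursive splitter also prefers paragraph and sentence boundaries; the function name here is illustrative, not the project's actual code.)

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 100) -> list[str]:
    """Split text into overlapping character windows.

    Simplified stand-in for a recursive character splitter: fixed windows
    of `chunk_size` characters, each sharing `overlap` characters with its
    predecessor, so no sentence is lost at a chunk boundary.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), step)]
    # Drop a trailing fragment already fully contained in the previous chunk.
    if len(chunks) > 1 and len(chunks[-1]) <= overlap:
        chunks.pop()
    return chunks
```

The overlap means a rule that straddles a boundary still appears intact in at least one chunk, which matters for retrieval quality.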
Each chunk is embedded using the sentence-transformers/all-mpnet-base-v2 model.
Embeddings are stored in a persistent Chroma database, so the index is built once and reused across sessions without re-embedding the corpus.
### Document Ingestion and Vectorization Pipeline

+---------------------+
|       MCC PDF       |
+---------------------+
           |
           v
+---------------------+
|     PDF Loader      |
+---------------------+
           |
           v
+---------------------+
|    Text Chunking    |
| (Recursive Splitter)|
+---------------------+
           |
           v
+-----------------------------+
| Sentence-Transformer Embeds |
|  (all-mpnet-base-v2 model)  |
+-----------------------------+
           |
           v
+-----------------------------+
| Chroma Persistent Vector DB |
+-----------------------------+
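Persistence means the expensive embedding step runs once and later sessions reload the index from disk. Chroma manages this internally when given a persist directory; the JSON round-trip below is only an illustration of that build-once, reuse-later property (the file layout and function names are hypothetical, not the project's code).

```python
import json
from pathlib import Path

def save_index(path: str, records: list[dict]) -> None:
    """Persist (id, embedding, chunk_text) records to disk.

    Chroma stores its collections this way conceptually; JSON is used
    here purely to illustrate persistence, not as the real storage format.
    """
    Path(path).write_text(json.dumps(records))

def load_index(path: str) -> list[dict]:
    """Reload a previously built index without re-embedding anything."""
    return json.loads(Path(path).read_text())
```

On startup the system can check whether the persisted index exists and skip ingestion entirely if it does.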
Semantic similarity search retrieves the top-k most relevant chunks.
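Under the hood, semantic search ranks stored chunk embeddings by cosine similarity to the query embedding. A dependency-free sketch of that ranking follows; the real system delegates this lookup to Chroma, and the function names here are illustrative.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec: list[float], index: list[tuple[list[float], str]], k: int = 3) -> list[str]:
    """Return the k chunk texts most similar to the query embedding.

    `index` is a list of (embedding, chunk_text) pairs; Chroma performs
    the equivalent nearest-neighbour lookup over its persisted collection.
    """
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[0]), reverse=True)
    return [text for _, text in ranked[:k]]
```

Because both chunks and queries are embedded with the same model, geometric closeness in the vector space approximates semantic relevance.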
LLM generation is constrained by a strict, context-only grounding prompt.
If retrieval returns no relevant chunks, the system replies with a fixed refusal:
"I don't know. No relevant information found."
### Query Pipeline

+------------------+
|    User Input    |
+--------+---------+
         |
         v
+---------------------+
|  Query Validation   |
+--------+------------+
         |
         v
+-----------------+
|    Retriever    |
+--------+--------+
         |
         v
+----------------------+
|    LLM Generation    |
+--------+-------------+
         |
         v
+-----------------------+
|   Response Output     |
+-----------------------+
- A strict system instruction enforces context-only answers.
- Generation cannot proceed without retrieval.
- A fixed refusal response is returned when no relevant data exists.
- A zero-temperature model configuration ensures deterministic output.
Technology stack:
- Python
- LangChain
- ChromaDB
- HuggingFace embeddings
- Groq LLM API
- dotenv
The ingestion module handles document loading, chunking, embedding, and storage.
The query module handles retrieval, prompting, and generation.
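The CLI layer can be as small as a read-answer loop. This sketch assumes a `query(text) -> str` callable wired to the retrieval-and-generation pipeline; the name and the injected I/O functions are illustrative, not the project's actual interface.

```python
def run_cli(query, input_fn=input, output_fn=print) -> None:
    """Minimal REPL: read a question, print the grounded answer, exit on 'quit'.

    `input_fn`/`output_fn` are injected so the loop can be exercised
    without a terminal; `query` is the retrieval+generation entry point.
    """
    output_fn("Cricket Laws assistant (type 'quit' to exit)")
    while True:
        try:
            line = input_fn("> ").strip()
        except EOFError:
            break
        if line.lower() in {"quit", "exit"}:
            break
        if line:
            output_fn(query(line))
```

Keeping the loop free of retrieval logic preserves the modular pipeline: the CLI only shuttles text in and out.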
This project demonstrates a grounded, minimal RAG architecture for rule-based AI systems. By enforcing retrieval-first generation and strict context grounding, the system achieves high reliability and hallucination resistance. It serves as a reference model for safe AI design in correctness-critical domains.
Project Repository: https://github.com/m-noumanfazil/cricket-rag-cli-using-langchain