A comprehensive, framework-agnostic toolkit for detecting potential hallucinations in Large Language Model (LLM) responses. Works with any LLM API including OpenAI GPT, Anthropic Claude, local models, and more.
A production-grade Rust MVP that color-codes and visualizes token-level confidence for LLM output.
```bash
cd rust_visualizer
cargo run -- --demo

# Custom run
cargo run -- --text-file sample.txt --confidence-file analysis.json

# Generate HTML
cargo run -- --demo --format html --output report.html
```
```rust
use llm_token_visualizer::quick_analyze;

let html = quick_analyze("Your text", "html")?;
```
```
/
├── hallucination_detector.py   # Python detector core
├── factgraph/                  # C++ DAG-based fact verifier
├── rust_visualizer/            # Rust-based token confidence renderer
├── examples/                   # Sample texts and demo inputs
└── README.md
```
```python
from hallucination_detector import HallucinationDetector, quick_hallucination_check

# Quick boolean check
response = "The Eiffel Tower was definitely built in 1887..."
is_suspicious = quick_hallucination_check(response, threshold=0.7)

# Detailed analysis
detector = HallucinationDetector()
result = detector.analyze_response(response)
print(f"Hallucination probability: {result.hallucination_probability:.2f}")
```
Simply copy the `hallucination_detector.py` file into your project directory.
```bash
git clone https://github.com/yourusername/llm-hallucination-detector.git
cd llm-hallucination-detector
```
```python
from hallucination_detector import HallucinationDetector

# Create detector instance
detector = HallucinationDetector()

# Analyze a response
response = "Your LLM response here..."
result = detector.analyze_response(response)

print(f"Hallucination Probability: {result.hallucination_probability:.2f}")
print(f"Issues Found: {result.detected_issues}")
print(f"Recommendations: {result.recommendations}")
```
```python
# Provide context for better accuracy
context = "The user asked about the Eiffel Tower's construction date."
response = "The Eiffel Tower was built in 1889 for the World's Fair."

result = detector.analyze_response(response, context=context)
```
```python
from hallucination_detector import (
    quick_hallucination_check,
    get_hallucination_score,
    analyze_with_recommendations
)

# Quick boolean check
is_hallucinating = quick_hallucination_check(response, threshold=0.7)

# Get just the probability score
score = get_hallucination_score(response)

# Full analysis with recommendations
analysis = analyze_with_recommendations(response, context="...")
```
Analyzes language patterns that indicate uncertainty or overconfidence:
Uncertainty Indicators:
Overconfidence Indicators:
Identifies responses with high concentrations of specific factual claims:
Evaluates logical flow and structural consistency:
Compares response content against provided context:
Identifies excessive repetition patterns:
Finds conflicting statements within the same response:
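The sketch below illustrates the flavor of these lightweight, pattern-based checks with two toy metrics: phrase density and sentence repetition. The phrase lists, function names, and scoring here are illustrative assumptions for exposition, not the detector's internal implementation.

```python
import re
from collections import Counter

# Illustrative phrase lists; the real detector exposes its own lists via
# detector.uncertainty_phrases / detector.overconfidence_phrases.
UNCERTAINTY_PHRASES = ["i believe", "possibly", "approximately", "not certain"]
OVERCONFIDENCE_PHRASES = ["definitely", "exactly", "without doubt", "always", "never"]

def phrase_density(text, phrases):
    """Matched phrases per word, as a rough pattern score."""
    lowered = text.lower()
    hits = sum(lowered.count(phrase) for phrase in phrases)
    return hits / max(len(lowered.split()), 1)

def repetition_score(text):
    """Fraction of sentences that occur more than once."""
    sentences = [s.strip().lower() for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    counts = Counter(sentences)
    repeated = sum(count for count in counts.values() if count > 1)
    return repeated / len(sentences)

text = "The tower is definitely 324 meters tall. The tower is definitely 324 meters tall."
print(phrase_density(text, OVERCONFIDENCE_PHRASES))  # 2 matches / 14 words ≈ 0.14
print(repetition_score(text))                        # 2 repeated / 2 sentences = 1.0
```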
For enhanced fact-checking capabilities, the detector can integrate with FactGraph - a real-time DAG-based fact verification engine written in C++.
```bash
# Install dependencies (Ubuntu/Debian)
sudo apt-get install libboost-graph-dev cmake build-essential

# Build the C++ engine
cd factgraph
chmod +x build.sh
./build.sh
```

### Usage with FactGraph

```python
from hallucination_detector import HallucinationDetector
from factgraph import create_factgraph_engine

# Create both detectors
pattern_detector = HallucinationDetector()
fact_engine = create_factgraph_engine()

# Load knowledge base
fact_engine.load_sample_knowledge_base()

# Add custom facts
paris_id = fact_engine.add_fact("Paris", "location", "capital of France", 0.95)
tower_id = fact_engine.add_fact("Eiffel Tower", "landmark", "built in 1889", 0.99)
fact_engine.add_relation(tower_id, paris_id, "located_in", 0.99)

# Enhanced detection
text = "The Eiffel Tower was built in 1889 in Paris."
pattern_result = pattern_detector.analyze_response(text)
fact_results = fact_engine.check_facts(text)

print(f"Pattern-based probability: {pattern_result.hallucination_probability:.2f}")
print(f"Fact verification results: {len(fact_results)} claims checked")
```

### FactGraph Features

- Real-time Performance: Graph traversal optimized for sub-second response
- Knowledge Graph Storage: Boost.Graph-based DAG for fact relationships
- Claim Extraction: Regex-based structured claim parsing
- Multi-level Verification: TRUE/FALSE/PARTIALLY_TRUE/CONTRADICTORY/UNVERIFIED
- Confidence Scoring: Weighted verification based on source reliability

## Integration Examples

### OpenAI GPT Integration

```python
import openai
from hallucination_detector import HallucinationDetector

def safe_gpt_query(prompt, max_retries=3):
    detector = HallucinationDetector()

    for attempt in range(max_retries):
        response = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": prompt}]
        )
        content = response.choices[0].message.content

        result = detector.analyze_response(content, context=prompt)

        if result.hallucination_probability < 0.7:
            return {
                "content": content,
                "confidence": result.confidence_score,
                "verified": True
            }

    return {"error": "High hallucination probability detected"}
```
```python
import anthropic
from hallucination_detector import analyze_with_recommendations

def claude_with_verification(prompt):
    client = anthropic.Anthropic()

    # Claude 3 models use the Messages API
    message = client.messages.create(
        model="claude-3-sonnet-20240229",
        max_tokens=1000,
        messages=[{"role": "user", "content": prompt}]
    )
    completion = message.content[0].text

    analysis = analyze_with_recommendations(completion, context=prompt)

    return {
        "response": completion,
        "hallucination_probability": analysis["hallucination_probability"],
        "issues": analysis["issues"],
        "recommendations": analysis["recommendations"]
    }
```
```python
from transformers import pipeline
from hallucination_detector import HallucinationDetector

# Works with any local model
generator = pipeline("text-generation", model="microsoft/DialoGPT-medium")
detector = HallucinationDetector()

def generate_with_verification(prompt):
    response = generator(prompt, max_length=100)[0]['generated_text']
    result = detector.analyze_response(response, context=prompt)

    return {
        "text": response,
        "reliability_score": result.confidence_score,
        "flags": result.detected_issues
    }
```
```python
from flask import Flask, request, jsonify
from hallucination_detector import analyze_with_recommendations

app = Flask(__name__)

@app.route('/verify', methods=['POST'])
def verify_response():
    data = request.json
    response_text = data.get('response')
    context = data.get('context', '')

    analysis = analyze_with_recommendations(response_text, context)

    return jsonify({
        'hallucination_probability': analysis['hallucination_probability'],
        'confidence': analysis['confidence'],
        'issues': analysis['issues'],
        'recommendations': analysis['recommendations'],
        'safe_to_use': analysis['hallucination_probability'] < 0.7
    })
```
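Once the service is running, any HTTP client can call it. Below is a minimal sketch using the `requests` library, assuming the app above is served locally on Flask's default port:

```python
import requests

payload = {
    "response": "The Eiffel Tower was definitely built in 1887.",
    "context": "When was the Eiffel Tower built?",
}

# Assumes the Flask app above is running locally on the default port 5000.
result = requests.post("http://localhost:5000/verify", json=payload).json()

print(result["hallucination_probability"], result["safe_to_use"])
```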
```python
detector = HallucinationDetector()

# Low sensitivity (fewer false positives)
result = detector.analyze_response(response, confidence_threshold=0.8)

# High sensitivity (catches more potential issues)
result = detector.analyze_response(response, confidence_threshold=0.5)
```
```python
# Extend detector with domain-specific patterns
detector = HallucinationDetector()

# Add medical terminology flags
detector.uncertainty_phrases.extend([
    "may indicate", "could suggest", "potentially related"
])

# Add financial overconfidence flags
detector.overconfidence_phrases.extend([
    "guaranteed returns", "risk-free investment", "certain profit"
])
```
HallucinationDetector
Main detection class with comprehensive analysis capabilities.
- `analyze_response(response, context=None, confidence_threshold=0.7)` → `DetectionResult`
- `_analyze_confidence_patterns(text)` → `float`
- `_calculate_factual_density(text)` → `float`
- `_analyze_coherence(text)` → `float`
- `_check_context_consistency(response, context)` → `float`
DetectionResult
Data class containing analysis results.
- `hallucination_probability: float` - Overall probability (0.0-1.0)
- `confidence_score: float` - Inverse of hallucination probability
- `detected_issues: List[str]` - Specific issues found
- `metrics: Dict[str, float]` - Detailed metric scores
- `recommendations: List[str]` - Actionable suggestions

`quick_hallucination_check(response, threshold=0.7)` → `bool`

Quick boolean check for hallucination detection.

`get_hallucination_score(response, context=None)` → `float`

Returns just the hallucination probability score.

`analyze_with_recommendations(response, context=None)` → `Dict`

Full analysis with actionable recommendations.
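A typical way to consume that dictionary, using the key names from the integration examples above (`response` and `original_query` are placeholder variables):

```python
from hallucination_detector import analyze_with_recommendations

analysis = analyze_with_recommendations(response, context=original_query)

if analysis["hallucination_probability"] >= 0.7:
    print("Potential hallucination detected")
    print("Issues:", analysis["issues"])
    print("Recommendations:", analysis["recommendations"])
```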
Based on testing with 1,000+ manually labeled responses:
| Metric | Score |
|--------|-------|
| Precision | 0.78 |
| Recall | 0.72 |
| F1 Score | 0.75 |
| AUC-ROC | 0.81 |
| Method | Accuracy | Speed | Memory |
|--------|----------|-------|--------|
| This Detector | 75% | Fast | Low |
| Semantic Similarity | 68% | Medium | Medium |
| Fact-Checking APIs | 82% | Slow | High |
| Manual Review | 95% | Very Slow | N/A |
```python
detector = HallucinationDetector()

# Modify internal scoring weights
detector._calculate_hallucination_probability = lambda metrics: (
    metrics.get('confidence_inconsistency', 0) * 0.4 +
    metrics.get('factual_density', 0) * 0.3 +
    metrics.get('contradiction_score', 0) * 0.3
)
```
```python
# Medical domain
medical_detector = HallucinationDetector()
medical_detector.uncertainty_phrases.extend([
    "consult your doctor", "seek medical advice", "may vary"
])

# Financial domain
financial_detector = HallucinationDetector()
financial_detector.overconfidence_phrases.extend([
    "guaranteed profit", "no risk", "certain return"
])
```
High False Positives
```python
# Raise the threshold to reduce sensitivity (fewer false positives)
result = detector.analyze_response(response, confidence_threshold=0.8)
```
Missing Context Issues
```python
# Always provide context when available
result = detector.analyze_response(response, context=original_query)
```
Performance Issues
```python
# For very long texts, consider chunking
def analyze_long_text(text, chunk_size=1000):
    chunks = [text[i:i+chunk_size] for i in range(0, len(text), chunk_size)]
    scores = [get_hallucination_score(chunk) for chunk in chunks]
    return sum(scores) / len(scores)
```
```python
# Enable detailed metrics
result = detector.analyze_response(response)
print("Detailed metrics:", result.metrics)

# Check individual components
print("Confidence issues:", result.metrics.get('confidence_inconsistency'))
print("Factual density:", result.metrics.get('factual_density'))
print("Coherence score:", result.metrics.get('coherence_score'))
```
response = """ The Eiffel Tower was definitely built in 1887 and is exactly 324 meters tall. It was designed by Gustave Eiffel and cost exactly $1.2 million to construct. Without doubt, it receives 7 million visitors every year. """ result = detector.analyze_response(response) # Output: High hallucination probability due to overconfident language
response = """ Python is always the best programming language for data science. However, Python is never suitable for machine learning projects. It's impossible to use Python for AI development. """ result = detector.analyze_response(response) # Output: High contradiction score detected
response = """ I believe the Eiffel Tower was built sometime in the late 1800s, possibly around 1889, but I'm not completely certain about the exact date. It seems to be approximately 300 meters tall, though I'd recommend checking official sources for precise measurements. """ result = detector.analyze_response(response) # Output: Lower hallucination probability due to appropriate uncertainty
We welcome contributions! Here's how you can help:
```bash
git clone https://github.com/yourusername/llm-hallucination-detector.git
cd llm-hallucination-detector

# Run tests
python -m pytest tests/

# Run examples
python hallucination_detector.py
```
1. Create a feature branch (`git checkout -b feature/amazing-feature`)
2. Commit your changes (`git commit -m 'Add amazing feature'`)
3. Push the branch (`git push origin feature/amazing-feature`)

This project is licensed under the MIT License - see the LICENSE file for details.
Help make AI more reliable, one response at a time.