Detect whether a Hindi news headline is Fake or Real.
VeriNews is a lightweight system for detecting fake news in Hindi headlines. By fine-tuning DistilBERT on a curated corpus of labeled Hindi news, the model makes real-time predictions while keeping memory and compute costs low enough for commodity hardware. A Streamlit front-end makes the detector accessible to non-technical users, so journalists, fact-checkers, and the public can test headlines instantly and help slow the spread of misinformation.
The rapid growth of online news, social media, and messaging apps in India has created fertile ground for misinformation. English-centric fact-checking tools struggle with Hindi, which remains the primary language for millions of readers. VeriNews addresses this gap by:
Figure 1. Use Case Diagram: VeriNews User Interaction
This diagram illustrates how different users interact with the VeriNews system. Users can input Hindi news headlines through either the Command Line Interface (CLI) or the Streamlit-based Web Interface. The system processes the input and returns a prediction: real or fake.
High-precision classification
Achieve ≥ 90 % F1-score on held-out Hindi headline data while maintaining low false-positive rates.
Low-latency inference
Keep end-to-end prediction time under 300 ms on a single CPU core to enable real-time use.
Resource efficiency
Limit GPU memory footprint to ≤ 2 GB during training; CPU-only inference after deployment.
User-friendly workflow
Provide both CLI and Streamlit interfaces plus clean, reproducible code (requirements, scripts, notebooks).
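The latency goal above can be checked with a small timing harness. The following is a minimal sketch, not part of the project code: `measure_latency_ms` and `within_budget` are hypothetical helper names, `predict` stands for whatever callable wraps the trained classifier, and comparing the slowest call against the 300 ms budget is an assumption about how the goal should be read.

```python
import statistics
import time
from typing import Callable, Dict, Sequence


def measure_latency_ms(predict: Callable[[str], object],
                       headlines: Sequence[str],
                       warmup: int = 3) -> Dict[str, float]:
    """Time one `predict` call per headline and summarise in milliseconds."""
    if not headlines:
        raise ValueError("need at least one headline to time")
    # Untimed warm-up calls so one-off costs (lazy loading, caches) are excluded.
    for text in headlines[:warmup]:
        predict(text)
    timings = []
    for text in headlines:
        start = time.perf_counter()
        predict(text)
        timings.append((time.perf_counter() - start) * 1000.0)
    return {
        "median_ms": statistics.median(timings),
        "max_ms": max(timings),
        "n": float(len(timings)),
    }


def within_budget(stats: Dict[str, float], budget_ms: float = 300.0) -> bool:
    """Assumption: the goal is met only if even the slowest call is under budget."""
    return stats["max_ms"] < budget_ms
```

To approximate the single-core condition, the process would also need to be pinned to one core (for example by limiting the framework's thread count), which this sketch does not do.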
Figure 2. Workflow Diagram: VeriNews System Pipeline
This diagram outlines the internal processing flow of the VeriNews application. It starts with user input, applies Hindi-specific text preprocessing, passes the data to the DistilBERT-based classifier, and displays the prediction via the chosen interface.
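The flow described above (input, preprocessing, classification, display) might be wired together for the CLI roughly as follows. This is an illustrative sketch only: the function names, the `(label, confidence)` return shape, and the output format are assumptions, `preprocess` covers only Unicode normalization and whitespace cleanup, and `placeholder_predict` is a stand-in that does no real classification.

```python
import argparse
import unicodedata
from typing import Callable, Optional, Sequence, Tuple

Prediction = Tuple[str, float]  # assumed shape: (label, confidence in [0, 1])


def preprocess(headline: str) -> str:
    """NFKC-normalize the text and collapse runs of whitespace."""
    return " ".join(unicodedata.normalize("NFKC", headline).split())


def run_pipeline(headline: str, predict: Callable[[str], Prediction]) -> str:
    """Input -> preprocessing -> classifier -> display string."""
    cleaned = preprocess(headline)
    if not cleaned:
        raise ValueError("headline is empty after preprocessing")
    label, confidence = predict(cleaned)
    return f"{label.upper()} ({confidence:.0%} confidence)"


def placeholder_predict(text: str) -> Prediction:
    """Stand-in for the real model; always returns the same dummy answer."""
    return ("real", 0.5)


def main(argv: Optional[Sequence[str]] = None,
         predict: Callable[[str], Prediction] = placeholder_predict) -> int:
    parser = argparse.ArgumentParser(
        description="Classify a Hindi headline as real or fake.")
    parser.add_argument("headline", help="Hindi news headline to check")
    args = parser.parse_args(argv)
    print(run_pipeline(args.headline, predict))
    return 0
```

Passing the classifier in as a callable keeps the same `run_pipeline` usable from both the CLI and a web front-end.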
Property | Details |
---|---|
Source | Aggregated from multiple Hindi news portals and fact-checking sites (see /dataset/README.md). |
Size | 40 k headlines (21 k real, 19 k fake) |
Time span | 2016-2023 |
Labelling process | Cross-verified by human annotators; fake samples validated against certified fact-check portals (AltNews, BOOM, PIB Fact-Check). |
Pre-processing | De-duplication, Unicode normalization (NFKC), removal of non-Devanagari punctuation, custom Hindi stop-word list, stemming, and WordPiece tokenization (from transformers library). |
Train / Val / Test split | 70 % / 15 % / 15 % (stratified by label) |
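Some of the preparation steps in the table can be sketched with the standard library alone. The example below covers only NFKC normalization, de-duplication, and a label-stratified 70/15/15 split; stop-word removal, stemming, and tokenization are left out. The function names, the `(headline, label)` row format, and the fixed seed are assumptions for illustration, not the project's actual script.

```python
import random
import unicodedata
from collections import defaultdict


def normalize_headline(text):
    """Apply NFKC normalization and collapse runs of whitespace."""
    return " ".join(unicodedata.normalize("NFKC", text).split())


def deduplicate(rows):
    """Keep the first (headline, label) row for each normalized headline."""
    seen = set()
    unique = []
    for text, label in rows:
        key = normalize_headline(text)
        if key and key not in seen:
            seen.add(key)
            unique.append((key, label))
    return unique


def stratified_split(rows, fractions=(0.70, 0.15, 0.15), seed=42):
    """Split rows into train/val/test, preserving the label ratio in each part."""
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[1]].append(row)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, val, test = [], [], []
    for label in sorted(by_label):
        group = by_label[label]
        rng.shuffle(group)
        n_train = round(len(group) * fractions[0])
        n_val = round(len(group) * fractions[1])
        train += group[:n_train]
        val += group[n_train:n_train + n_val]
        test += group[n_train + n_val:]  # remainder goes to the test set
    return train, val, test
```

Splitting each label group separately is what keeps the real/fake ratio the same across the three partitions.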
VeriNews uses DistilBERT (distilbert/distilbert-base-multilingual-cased) to classify Hindi news headlines as real or fake. DistilBERT is a lighter, faster version of BERT that retains most of its accuracy while reducing model size and inference time, which makes it a good fit for real-time applications.
Key Details:
DistilBERT strikes a balance between speed and accuracy, making it well-suited for a public-facing Hindi fake news detector.
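For illustration, inference with a sequence-classification head on this checkpoint could look roughly like the sketch below, which uses the Hugging Face `transformers` Auto classes. Several details are assumptions: the label order (`index 0 = real`, `index 1 = fake`), the `max_length` of 64 tokens, and the helper names. Loading the base checkpoint directly yields an untrained classification head, so `model_dir` would need to point at the fine-tuned weights for the output to mean anything.

```python
import math

MODEL_NAME = "distilbert/distilbert-base-multilingual-cased"  # checkpoint named in the text
LABELS = ("real", "fake")  # assumption: index 0 = real, index 1 = fake


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logits_to_verdict(logits):
    """Map raw classifier logits to (label, probability of that label)."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]


def classify(headline, model_dir=MODEL_NAME):
    """Tokenize one headline, run the model on CPU, and return a verdict."""
    # Imported lazily so the helpers above work without torch/transformers installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForSequenceClassification.from_pretrained(
        model_dir, num_labels=len(LABELS))
    model.eval()  # disable dropout for inference
    inputs = tokenizer(headline, return_tensors="pt",
                       truncation=True, max_length=64)  # 64 is an assumed cap
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return logits_to_verdict(logits)
```

In a long-running app the tokenizer and model would be loaded once and reused rather than reloaded on every call as in this sketch.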
Python
dataset
directory of the project.
pip install -r requirements.txt
Make sure you have Python installed and a virtual environment activated to avoid dependency issues.
script.py
file and train the model (if not already).
python script.py
streamlit run main.py
Figure 3. UI built on Streamlit showing the Homepage.
Figure 4. Snapshot of the dataset.
Figure 5. User Interface showing output as Fake News
Figure 5 shows the user interface with the output classified as Fake News. Once the user submits a headline, the system analyzes it and flags it as potentially misleading or false, so users can avoid spreading unverified or deceptive content.
Figure 6. User Interface showing output as Real News
Figure 6 shows the user interface of the Hindi Fake News Detection System with the output classified as Real News. After the user submits a headline, the system processes and classifies it, indicating that the content is likely credible. The interface presents a clear result, helping users verify the headline's authenticity.
VeriNews addresses the challenge of detecting fake news in Hindi, an underserved area in the misinformation landscape. By combining the efficiency of DistilBERT with Hindi-specific preprocessing and an accessible interface, the system offers both technical rigor and practical usability. Whether used by journalists, researchers, or everyday readers, VeriNews helps users check the credibility of Hindi news headlines quickly.
Future enhancements could include multilingual expansion and integration with real-time news sources for live detection.