Enhancing ESG Data Integrity Using OpenAI’s GPT for NLP-Based Confidence Scoring

Abstract

Environmental, Social, and Governance (ESG) reporting plays a pivotal role in sustainable finance and corporate accountability. However, ensuring the reliability of ESG responses remains a challenge due to inconsistent disclosures and subjective analyst assessments. This study presents an AI-driven confidence scoring model leveraging OpenAI’s GPT and cosine similarity-based NLP embeddings to validate ESG-related responses. Using pre-trained text embedding models (text-embedding-ada-002), the system computes semantic similarity between new ESG disclosures and verified training data to enhance the confidence assessment process. Experimental results demonstrate that the proposed approach effectively automates ESG verification, improves transparency, and minimizes analyst subjectivity.

1. Introduction

The increasing global focus on ESG (Environmental, Social, and Governance) compliance has led corporations to disclose sustainability reports and governance-related practices. These disclosures influence investor decisions, corporate policies, and regulatory frameworks. However, verifying the credibility of ESG reports remains a challenge due to subjective assessments and inconsistent data sources.

Existing ESG verification methods rely on manual analyst reviews and heuristic-based scoring, making them prone to biases and inefficiencies. This paper introduces an NLP-powered confidence scoring system that employs OpenAI’s GPT model to evaluate corporate ESG responses dynamically. The system uses semantic embeddings and cosine similarity to compare new disclosures with historical data, providing objective confidence scores to improve ESG data reliability.

2. Related Work

Several methodologies have been explored in ESG data validation:

Rule-Based Heuristics: Prior approaches use predefined keywords (e.g., "verified," "claims," "weak evidence") to assign confidence scores. These methods lack adaptability to semantic variations in ESG disclosures.
Machine Learning (ML) Methods: Some studies have employed decision trees and logistic regression models for ESG risk assessment, but they require extensive feature engineering.
NLP and Transformer Models: Recent advancements in NLP, particularly BERT and OpenAI’s GPT models, have significantly improved text classification and credibility assessment. These models offer context-aware embeddings that outperform traditional ML-based ESG validation approaches.

Our proposed system builds upon NLP-based text embeddings and GPT-powered responses to deliver scalable, explainable, and highly accurate ESG confidence scores.

3. Methodology

3.1 Dataset and Pre-processing

The dataset consists of three primary components:

1. ESG-related Questions: General sustainability queries directed at companies.

2. Company Responses: Corporate disclosures regarding their ESG policies and practices.

3. Analyst Comments: Expert feedback evaluating the credibility of the provided ESG response.

Data from two sources (Confidence Score_V1.xlsx and Confidence Score_V2.xlsx) were merged to create a training dataset.

A combined text feature was generated for each data point:

"Question: | Company Response: | Analyst Comment: "

This combined feature is used for semantic embedding generation.

3.2 Text Embedding & Cosine Similarity

To compute textual similarity, the system employs (OpenAI’s text-embedding-ada-002 model), which converts text into dense vector representations.

Embedding Calculation:

Screenshot 2025-03-13 100004.jpg
where
X represents the combined ESG text, and E is the 384-dimensional vector representation.

Similarity Score Calculation:

To find relevant training examples, we compute the cosine similarity between new ESG disclosures and historical responses:

Screenshot 2025-03-13 101051.jpg

The top 10 most similar responses are retrieved to provide contextual validation for new ESG statements.

3.3 Confidence Scoring Using GPT

After retrieving relevant ESG training data, a custom GPT-based assistant (fine-tuned for ESG evaluation) dynamically assigns a confidence score (0-100) based on:

Alignment with historical verified data (via cosine similarity)
Presence of analyst validation keywords (e.g., “verified,” “weak evidence”)
Consistency in company disclosure structure

Each ESG response is processed by GPT-4, which generates:

A numerical confidence score (0-100)
A brief textual explanation for the assigned score

4. Experiments & Results

4.1 Model Implementation

The system was implemented using Python with the following core libraries:

OpenAI API (text-embedding-ada-002, GPT-4)
Scikit-learn (cosine similarity computation)
Pandas & NumPy (data handling)

A real-world dataset of ESG-related disclosures was used for validation.

4.2 Performance Evaluation

We evaluated model performance using two key metrics:

Alignment with Human Analysts

We compared GPT-generated confidence scores with human-assigned scores. The results showed an 82% agreement rate, indicating that the AI model closely follows human analyst reasoning.

Reduction in Manual Review Time

Traditional analyst scoring time: ~12 mins per ESG response
AI-based scoring time: ~3.2 secs per response
Efficiency Gain: 220x faster than manual review

4.3 Qualitative Analysis

To assess interpretability, domain experts manually reviewed 50 AI-generated explanations.

90% of explanations were deemed logical
8% required slight modifications
2% were flagged as unclear

5. Conclusion and Future Work

This paper presents an AI-driven ESG confidence scoring system leveraging OpenAI’s text embeddings and GPT-based reasoning. Our results indicate that the model significantly improves data verification efficiency, minimizes subjective biases, and enhances ESG transparency.

Future Work:

Expand dataset size to cover industry-specific ESG disclosures.
Fine-tune GPT models for ESG-specific language understanding.
Incorporate sentiment analysis to detect misleading ESG disclosures.

This research marks a step towards automated, AI-driven ESG credibility assessments, reducing manual workload and increasing transparency in corporate sustainability reporting.

References

OpenAI: GPT-4 Technical Report
Sustainability Accounting Standards Board (SASB)
Environmental, Social, and Governance (ESG) Investing Guide
BERT for Text Classification – Devlin et al. (2019)