Companies are facing increasing pressure from stakeholders — investors, customers, partners, regulators, and the public — to measure and disclose performance on topics across environmental, social, and governance (ESG) programs. But aligning corporate ESG strategy with those stakeholder expectations and priorities is a challenge.
To do this, companies conduct what is known as a materiality assessment once a year, but this exercise can only go so far. It is limited by mountains of hard-to-sort data, constantly shifting viewpoints, and, typically, constrained resources.
It is incredibly difficult for sustainability teams to pursue initiatives across the full breadth of ESG topics, which includes greenhouse gas emissions, workforce diversity, and business ethics. They must therefore focus on the topics that are most relevant and material to their business.
How AI Can Help
AI is a natural solution to keep track of changes in stakeholder priorities over time and widen the aperture of stakeholder signals. One technique, natural language processing (NLP), is particularly well suited to address this problem because it can be used to digest stakeholder documents such as reports, press releases, and engagement guidelines and identify ESG sentiment.
Even with NLP, building an AI application that ingests and processes data from ESG documents in a standardized way is a difficult task. There are many questions to ask when designing the technical features of this application, but they can all boil down to one: How do we measure the extent to which a document discusses a certain ESG topic?
This was a critical question that the C3 AI data science team needed to answer while building the C3 AI ESG application. The outcome: A five-step process that quantifies ESG stakeholder materiality.
- Data Ingestion: Web scraping to capture and store all relevant stakeholder documents.
- Data Preparation: Document cleansing and parsing to digest input documents into NLP-ready paragraphs.
- Paragraph-Level Analysis Pipeline: Identifying ESG topics and scoring materiality in paragraphs with NLP.
- Aggregate Materiality: Machine-learning (ML) aggregation of ESG materiality scores across documents and stakeholders, all weighted over time.
- User Insights: Exposing actionable AI recommendations with visuals and alerts.
Although data ingestion and preparation are critical to the success of the NLP model, the novel piece of this technology is in the paragraph-level analysis pipeline — shown as step three in the workflow diagram. This is when the machine learning NLP pipeline determines whether and to what extent a paragraph discusses an ESG topic.
This pipeline is an ensemble of four component models. Each makes its own decision about the content of a paragraph, and those decisions are combined into a final prediction that answers the question: does this paragraph discuss a specific ESG topic?
That decision is then followed by a post-processing step that accounts for how ESG topics overlap and relate to one another. For example, suppose the model is analyzing a report that mentions two ESG topics: greenhouse gas (GHG) emissions and biodiversity. Those topics are related to each other, and both fall under the umbrella of a broader ESG topic, climate. The report may never mention climate explicitly even though GHG emissions and biodiversity are being discussed under that umbrella. In that case, the post-processing step raises the score assigned to climate so that the topic is weighted fairly. In short, the step captures how topics relate hierarchically in whichever document is being analyzed.
Before diving into the ensemble model components, let's first look at the underlying NLP models and how their mechanisms informed our engineering of the paragraph-level pipeline.
How C3 AI ESG Uses Large Language Models
The NLP approach for ESG stakeholder materiality leverages two publicly available large language models (LLMs): a general BERT model and a domain-specific BERT model. Both are used to generate text embeddings, which are numerical representations of text; similar embeddings represent text with similar meanings. For a given ESG topic, the embeddings represent the topic's definition and semantically related words, and the pipeline uses them to identify when and to what extent the topic is discussed in a paragraph.
Text embeddings from both LLMs are leveraged in the NLP pipeline:
- General Embedding: Produced by a general BERT model that is effective at capturing semantic similarity between sentences and short paragraphs.
- Domain Embedding: Produced by a domain BERT model that is particularly strong at capturing the ESG-specific content of the text.
Building the Unique Ensemble Model
The ML pipeline takes paragraph text data as inputs and produces a prediction that indicates whether an ESG topic is discussed in the paragraph. It also produces a score representing how similar the paragraph is to a “centroid” (domain embedding) that defines the ESG topic. Both outputs are produced on a per topic per paragraph basis because each paragraph can discuss multiple ESG topics.
The pipeline operates at the paragraph level because of the token length limitations of LLMs, and because a paragraph is the shortest component of a document at which the presence of an ESG topic can be measured without losing context.
Building such a complex ensemble NLP pipeline is enabled by the C3 AI Platform architecture for composing ML pipelines. The pipeline is designed to match the performance of human ESG experts by leveraging four component models. The key term–based prediction uses expert knowledge, defined through ESG topic key terms, to perform well in low-data environments. The centroid-based prediction uses labeled data to continuously enhance model performance over time. Together, the ensemble pipeline applies LLMs to produce predictions 100x faster than expert labelers and is extensible to new ESG issues as they arise.
The paragraph-level prediction is produced by an ML model that combines the strengths of rule-based and NLP-based methods. By leveraging predictions from four component models, this ensemble model achieves higher overall model performance.
For key term search to work reliably, a text-processing pipeline first normalizes both the paragraph text and the key terms through punctuation removal, removal of extra whitespace, lowercasing, and lemmatization. This ensures that small differences in how a term is written do not affect whether a matching set of key terms is identified.
The key term search logic is configured per ESG topic and consists of a Boolean logical expression that combines key term groups. An illustrative example for Customer Privacy is below:
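The normalization steps listed above can be sketched in a few lines of Python. This is a minimal illustration, not the application's actual pipeline: the function name `normalize` is hypothetical, and the small `TOY_LEMMAS` lookup table is an assumption that stands in for a real lemmatizer from an NLP library.

```python
import string

# Assumption: a tiny lookup table standing in for a real lemmatizer.
TOY_LEMMAS = {
    "breaches": "breach", "breached": "breach",
    "leaks": "leak", "leaked": "leak",
    "customers": "customer", "users": "user", "accounts": "account",
}

# Map every ASCII punctuation character to a space so words stay separated.
_PUNCT_TABLE = str.maketrans({ch: " " for ch in string.punctuation})


def normalize(text):
    """Lowercase, strip punctuation, collapse whitespace, and lemmatize tokens."""
    text = text.lower().translate(_PUNCT_TABLE)
    tokens = text.split()  # split() also collapses runs of whitespace
    return " ".join(TOY_LEMMAS.get(tok, tok) for tok in tokens)


print(normalize("  Customers'   accounts were BREACHED!  "))
```

Applying the same function to both the paragraph and the key terms is what makes "Customers" in a document comparable to "customers" in a key term list.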
Boolean Expression: g0 | (g1 & g2)
g0: private data, personally identifiable information, PII, privacy, GDPR
g1: leak, breach, hack
g2: accounts, users, customers
In this case, a paragraph is given a positive prediction if it contains any of the key terms in g0, or at least one key term in g1 and at least one key term in g2.
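As a rough sketch, the Customer Privacy rule above could be evaluated as follows. The group contents and the expression come from the example; the function names and the whole-word matching strategy are assumptions made for illustration, and lemmatization is deliberately left out to keep the block short.

```python
import re

# Key term groups from the Customer Privacy example.
GROUPS = {
    "g0": ["private data", "personally identifiable information", "PII", "privacy", "GDPR"],
    "g1": ["leak", "breach", "hack"],
    "g2": ["accounts", "users", "customers"],
}


def group_hit(paragraph, terms):
    """True if any term occurs in the paragraph as a whole word or phrase."""
    text = paragraph.lower()
    return any(
        re.search(r"\b" + re.escape(term.lower()) + r"\b", text) for term in terms
    )


def customer_privacy_prediction(paragraph):
    """Evaluate the Boolean expression g0 | (g1 & g2) for one paragraph."""
    hits = {name: group_hit(paragraph, terms) for name, terms in GROUPS.items()}
    return hits["g0"] or (hits["g1"] and hits["g2"])


print(customer_privacy_prediction("A breach exposed thousands of customers."))
print(customer_privacy_prediction("A leak in the pipeline was repaired."))
```

A paragraph such as "Attackers hacked user records" is not matched by this sketch, because "hacked" and "user" differ from the listed forms "hack" and "users". That is the kind of gap the lemmatization step in the pre-processing pipeline is meant to close.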
During model training, all training paragraphs are given “weak labels” that are defined as the results of the key term search pipeline. These weak labels and the corresponding general embeddings of those paragraphs are saved in the model, which is why the approach is considered weakly supervised. During inference, the model identifies the paragraphs in the training set with the nearest general embeddings to the input paragraph. If a high enough proportion of these identified training set paragraphs have positive weak labels, the input paragraph is predicted to discuss the ESG topic.
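The inference step described above resembles a nearest-neighbor vote over weak labels. The sketch below illustrates that idea under several assumptions that the article does not specify: cosine similarity as the measure of nearness, `k` neighbors, and a `min_positive_share` threshold. The function names and the two-dimensional toy embeddings are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def weak_label_knn(query, train_embeddings, weak_labels, k=3, min_positive_share=0.5):
    """Predict True if enough of the k nearest training paragraphs have positive weak labels."""
    ranked = sorted(
        range(len(train_embeddings)),
        key=lambda i: cosine(query, train_embeddings[i]),
        reverse=True,
    )
    nearest = ranked[:k]
    if not nearest:
        return False  # no training data to vote
    share = sum(1 for i in nearest if weak_labels[i]) / len(nearest)
    return share >= min_positive_share


# Toy general embeddings and the weak labels produced by key term search.
TRAIN = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.3], [0.0, 1.0], [0.1, 0.9]]
LABELS = [True, True, False, False, False]

print(weak_label_knn([1.0, 0.05], TRAIN, LABELS))
print(weak_label_knn([0.0, 1.0], TRAIN, LABELS))
```

Because the labels come from key term search rather than from human annotators, this component can be trained before any expert-labeled data exists.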
To recap key term-based prediction: if both of its component models (the key term search and the weakly supervised nearest-neighbor model) yield positive predictions, then the ensemble model overall predicts that the paragraph discusses the ESG issue. The next section discusses the other half of the ensemble model: Centroid-Based Prediction.
The ensemble model yields a positive prediction using the following Boolean expression based on the component model predictions: (1 & 2) | (3 & 4). In other words, the ensemble prediction is true if the predictions for 1 and 2 are both true, if the predictions for 3 and 4 are both true, or both.
The key term–based prediction indicates that a paragraph discusses a topic if it contains the key terms of the topic and has a general embedding similar to those of other paragraphs that contain the key terms. The centroid-based prediction indicates that a paragraph discusses a topic if its domain embedding is both similar to the domain embedding of the topic definition and similar to positively labeled training paragraphs.
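The combination logic can be written out as a short function. This is a sketch only: the parameter names are hypothetical, and the mapping of components 1 and 2 to the key term-based half and components 3 and 4 to the centroid-based half is inferred from the description above rather than stated as a numbered list.

```python
def ensemble_prediction(key_term_hit, weak_label_neighbors_hit,
                        centroid_hit, labeled_neighbors_hit):
    """Combine four component predictions as (1 & 2) | (3 & 4)."""
    # Assumed components 1 and 2: key term search and similarity to weakly labeled paragraphs.
    key_term_based = key_term_hit and weak_label_neighbors_hit
    # Assumed components 3 and 4: similarity to the topic centroid and to positively labeled paragraphs.
    centroid_based = centroid_hit and labeled_neighbors_hit
    return key_term_based or centroid_based


print(ensemble_prediction(True, True, False, False))
print(ensemble_prediction(True, False, False, True))
```

Requiring agreement within each half limits false positives from any single component, while the "or" between halves lets either the expert-defined terms or the labeled data carry a prediction on its own.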
Ensuring Deduplication with Post Processing
Some ESG topics are, by definition, supersets of others. Once the ensemble model has processed all input paragraphs, a post-processing pipeline double-checks the predictions for parent and child ESG topics. The pipeline handles these cases by predicting a “parent topic” as present in a paragraph if any of its “child topics” are discussed in the paragraph. The pipeline calculates the embedding similarity score for a parent topic as the maximum of its own similarity score and the maximum similarity score of its children.
For example, “Climate” is defined as a parent ESG topic of “Greenhouse Gas Emissions.” If a paragraph is predicted to discuss “Greenhouse Gas Emissions” but not “Climate,” the post-processing pipeline will give it a positive prediction for both and will set the embedding similarity score for “Climate” to be the maximum of the similarity scores for “Climate” and “Greenhouse Gas Emissions.”
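The parent and child rule can be sketched as follows, using the Climate example. The function name, the dictionary layout, the sample scores, and the default score of 0.0 for a parent with no score are assumptions made for illustration; the sketch also handles only a single level of hierarchy.

```python
# Parent topic -> child topics (from the example above).
PARENT_TO_CHILDREN = {"Climate": ["Greenhouse Gas Emissions"]}


def propagate_to_parents(predictions, scores, hierarchy=PARENT_TO_CHILDREN):
    """Return copies in which parent topics inherit predictions and scores from children."""
    new_predictions = dict(predictions)
    new_scores = dict(scores)
    for parent, children in hierarchy.items():
        # A parent is predicted present if any of its children is present.
        if any(predictions.get(child, False) for child in children):
            new_predictions[parent] = True
        # Parent score = max(own score, highest child score).
        child_scores = [scores[child] for child in children if child in scores]
        new_scores[parent] = max([scores.get(parent, 0.0)] + child_scores)
    return new_predictions, new_scores


preds = {"Climate": False, "Greenhouse Gas Emissions": True}
sims = {"Climate": 0.41, "Greenhouse Gas Emissions": 0.78}
print(propagate_to_parents(preds, sims))
```

The inputs are copied rather than modified, so the raw ensemble outputs remain available for comparison with the post-processed ones.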
The Result: A Powerful AI Application for ESG Leaders
The C3 AI ESG application leverages NLP to surface insights on stakeholder ESG materiality for an organization, removing the guesswork and the high volume of manual effort otherwise required. To quantify stakeholder materiality, the AI-enabled application uses a novel data science approach to continuously monitor and evaluate discussion of key ESG topics in published stakeholder documents. It persists the results in a time series so that changes in stakeholder priorities can be evaluated over time.
In the second half of this series, we will explore how C3 AI ESG uses the information extracted from stakeholder documents to surface insights, and how ESG leaders can use those insights to develop strategic goals and actionable plans.
About the Authors
Robert Young (author) is a Manager on the Data Science team at C3 AI, where he develops machine learning and optimization solutions for sustainability, supply chain, and predictive maintenance problems. Prior to C3 AI, he built AI systems for smart buildings. He holds an MS in Engineering from Stanford University.
Jessica Matthys (editor) is a Product Manager at C3 AI, working on the C3 AI Sustainability Suite. Prior to C3 AI, Jessica worked in energy and sustainability at Tesla and Accenture. She has an MBA from the Kellogg School of Management at Northwestern University and a BSE in Mechanical Engineering from Duke University.
Thank you to the C3 AI ESG Data Science team, including Hang Le and Suvansh Dutta, for their contributions to this blog.