ESG Data

Alternative Data Category Description

‘ESG’ (Environmental, Social, and Governance) is a term that has become increasingly popular with investors and asset managers in recent years. Interest has grown significantly as the number of ESG-focused mandates in the asset management industry has risen rapidly. Furthermore, studies have indicated that COVID-19 increased investor sensitivity to ESG issues, accelerating the ESG investing trend. With the introduction of new sustainable finance disclosure regulations and taxonomies from the UN and across the EU, North America, and Asia, it is more important than ever to understand ESG data and to know where investors can source reliable ESG alternative datasets in order to assess the impact of their portfolios and offerings against their sustainable investing goals.

ESG is based on three main pillars, Environmental, Social, and Governance, and each pillar’s materiality differs across regions, companies, industries, and even asset classes. ESG factors apply not only to public equities but also to fixed income, private equity, infrastructure, real estate, and so on. Another factor driving interest in ESG-related alternative data is scrutiny of traditional ESG rating providers, which have faced various critiques in recent years. Critiques such as inaccurate data, biases, inconsistencies, lack of clear methodologies, lack of granularity, and slow data updates (high latency) have led investors to seek out sources of alternative data for ESG investing.

Technological innovation has paved the way for alternative data aggregators to collect ESG-focused data from disparate sources such as company disclosures, news articles, customer ratings/reviews, company websites, and social and professional networking platforms. Aggregators commonly scrape these sources and process the data using entity recognition, AI (artificial intelligence), and NLP (natural language processing) tools to identify key metrics and concepts. Furthermore, innovation in the geospatial industry has allowed data collection through satellites and thermal imagery to identify key emitters and environmental issues such as biodiversity loss, hazardous waste disposal, deforestation, etc. In terms of the Social pillar, scraping of job boards, employee reviews, and corporate websites has allowed investors to identify the employee makeup of companies, providing insights into a company’s diversity, inclusion, and engagement efforts. From a Governance perspective, AI and ML (machine learning) tools have helped investors scan numerous company documents and financial disclosures to identify accounting issues and risk of fraud.

The frequency of ESG data ranges from daily to monthly, and even quarterly in some cases, depending on the underlying source of the data. The data is often at a ticker and company level, but can also be aggregated at a sector, industry, and region/country level in order to gauge macroeconomic developments, sovereign ratings, and industry trends for top-down analysis. ESG data is usually priced in the range of $40k to $100k depending on the coverage, history, and latency of the data, along with the underlying source. The majority of alternative data providers cover the Americas and Europe; however, the number of data vendors covering emerging markets has grown in recent years.

Subcategory - News NLP Data

Another alternative data source for evaluating a company or asset’s ESG metrics is sentiment analysis derived from news and social media feeds as well as corporate filings. Many vendors specialize in turning unstructured news and social media text into structured files for sentiment analysis using Natural Language Processing (NLP) technologies. Analyzing news articles, journals, and trade publications can help investors derive a sentiment towards a particular firm-level issue such as labor rights or wastewater treatment, or towards the company as a whole. Analysis of public opinions posted on Twitter, Reddit, Facebook, and other social network sites, along with reviews and ratings sites, can provide an overall view of consumers’ perspective on a company or a specific issue the company is facing. In addition, NLP and sentiment analysis of company reviews and ratings provided by employees on sites such as Glassdoor, OpenWork, etc. can provide insight into a company’s culture, workforce issues, employee satisfaction, and so on, thus providing a more granular and frequent view into a company’s ESG initiatives.
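As a rough illustration of how unstructured headlines might be turned into structured, issue-level sentiment, the sketch below uses a simple keyword lexicon. The word lists, topic names, tickers, and headlines are all hypothetical, and production vendors typically rely on much richer NLP models than keyword matching.

```python
import re
from collections import defaultdict

# Hypothetical, illustrative word lists (assumptions, not a vendor lexicon).
POSITIVE = {"improved", "award", "reduced", "safe", "compliant"}
NEGATIVE = {"violation", "spill", "lawsuit", "strike", "fine"}
TOPICS = {
    "labor rights": {"labor", "union", "strike", "workers"},
    "wastewater treatment": {"wastewater", "spill", "discharge"},
}

def score_headline(text):
    """Return (topics, sentiment) for one headline; sentiment is in [-1, 1]."""
    tokens = re.findall(r"[a-z]+", text.lower())
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    total = pos + neg
    sentiment = (pos - neg) / total if total else 0.0
    topics = [name for name, words in TOPICS.items() if words & set(tokens)]
    return topics, sentiment

def aggregate(records):
    """Average sentiment per (ticker, topic) from (ticker, headline) pairs."""
    buckets = defaultdict(list)
    for ticker, headline in records:
        topics, sentiment = score_headline(headline)
        for topic in topics:
            buckets[(ticker, topic)].append(sentiment)
    return {key: sum(vals) / len(vals) for key, vals in buckets.items()}

# Made-up headlines for a made-up ticker.
news = [
    ("ABC", "Regulator issues fine after wastewater spill at ABC plant"),
    ("ABC", "ABC workers end strike as union wins improved contract"),
]
result = aggregate(news)
```

The output is keyed by (ticker, topic), which mirrors the structured, issue-level files described above.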

Subcategory - ESG Raw Data

ESG issues may be monitored using a variety of new alternative data sources. These include social media, satellite, and open and public data sources.

These alternative datasets are raw in nature in the sense that they are not quantified or aggregated into a score or rating based on a certain ESG framework or methodology. ESG raw data could be obtained from satellite imagery and IoT sensors to track carbon and GHG emissions, deforestation activity, air quality, and also energy, waste, and water management. Open and public alternative data sources help collect raw data on the Social pillar such as a company’s employee makeup to assess employee seniority, gender gaps, compensation gaps, etc. These datasets provide insight into the investee company’s social and human capital. Social capital measures the expectations that the business will contribute to society at large. The human capital component focuses on how labor practices impact an employee’s health and safety.

Employee review and sentiment data can be offered as aggregated ratings given to companies across different sectors by former and current employees, along with the number of reviews at each star level (1-5). Metrics can include review text, author title, company pros and cons, recommended value, outlook value, CEO/senior management review, rating, career opportunities, work-life balance, culture, etc. Employee satisfaction offers insight into a company’s Social and Governance factors within the broader ESG landscape. A high employee satisfaction score may be associated with stronger sales growth, whereas a low employee satisfaction score and a lack of health and safety measures can lead to higher attrition, resulting in loss of business and competitive advantage.
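To make the shape of such aggregated review data concrete, here is a minimal sketch that rolls individual star ratings up into a per-company average and a count of reviews at each star level. The company names and ratings are invented for illustration, and real datasets would carry many more fields.

```python
from collections import Counter
from statistics import mean

# Hypothetical review records: (company, star rating from 1 to 5).
reviews = [("ABC", 5), ("ABC", 4), ("ABC", 2), ("XYZ", 3), ("XYZ", 1)]

def summarize(reviews):
    """Return per-company average rating and number of reviews per star level."""
    by_company = {}
    for company, stars in reviews:
        by_company.setdefault(company, []).append(stars)
    summary = {}
    for company, stars in by_company.items():
        counts = Counter(stars)
        summary[company] = {
            "avg_rating": round(mean(stars), 2),
            # Counter returns 0 for star levels with no reviews.
            "star_counts": {s: counts[s] for s in range(1, 6)},
        }
    return summary

summary = summarize(reviews)
```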

Due to the lack of consistency and accuracy across ESG ratings and scores provided by third parties, many investors are now leaning toward generating ESG scores themselves based on in-house methodologies. This is where ESG raw data adds value to investors as they can input these raw datasets into their models to generate an overall ESG score or use a certain component of ESG data as a key factor in their analysis based on its materiality. ESG raw data also helps investors identify greenwashing in companies wherein a company’s reported metrics on their ESG initiatives differ from the ground truth.
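One way an in-house methodology could combine raw metrics into a single score is a materiality-weighted average of peer-normalized values. The sketch below assumes min-max scaling across a peer group and made-up metric names, weights, and figures; it is an illustration of the idea, not a recommended framework.

```python
def min_max_scale(values, higher_is_better=True):
    """Scale raw values to [0, 1] across the peer group."""
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:
        # No dispersion across peers: give everyone a neutral value.
        return {k: 0.5 for k in values}
    scaled = {k: (v - lo) / (hi - lo) for k, v in values.items()}
    if not higher_is_better:
        scaled = {k: 1.0 - s for k, s in scaled.items()}
    return scaled

def esg_scores(raw, directions, weights):
    """Combine scaled metrics into a weighted 0-100 score per company."""
    total_weight = sum(weights.values())
    scaled = {m: min_max_scale(raw[m], directions[m]) for m in weights}
    companies = next(iter(raw.values())).keys()
    return {
        c: 100.0 * sum(weights[m] * scaled[m][c] for m in weights) / total_weight
        for c in companies
    }

# Hypothetical raw inputs for three made-up companies.
raw = {
    "ghg_intensity": {"AAA": 120.0, "BBB": 80.0, "CCC": 100.0},
    "women_in_mgmt_pct": {"AAA": 40.0, "BBB": 20.0, "CCC": 30.0},
}
# Lower emissions intensity is better; higher representation is better.
directions = {"ghg_intensity": False, "women_in_mgmt_pct": True}
# Materiality weights are an assumption an investor would set per sector.
weights = {"ghg_intensity": 0.6, "women_in_mgmt_pct": 0.4}

scores = esg_scores(raw, directions, weights)
```

Changing the weights is how materiality enters: an investor who considers emissions more material for a given sector would raise that metric's weight relative to the others.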

Subcategory - ESG Scores Data

An ESG score is a measure of a company's exposure to long-term environmental, social, and governance risks that are often overlooked in traditional financial analysis. These risks include factors such as energy efficiency, worker safety, and board diversity, to name a few, all of which can have significant operational and financial consequences. A strong ESG score/rating indicates that a company manages its ESG risks well in comparison to its peers, whereas a poor ESG score/rating indicates that the company has relatively higher ESG risk exposure. The space is now saturated with data companies providing ESG scores, and some of the prominent ESG score alternative data providers are well known by now.

ESG rating alternative data providers have developed individual criteria for assessing the ESG performance of companies based on data gathered from a variety of sources such as company filings, voluntary business disclosures, governmental databases, academic research, and media reports. ESG scores/ratings are typically based on voluntary disclosure frameworks as well as regulatory frameworks, such as the Global Reporting Initiative (GRI), SASB, CDP, UN Sustainable Development Goals (SDGs), SFDR, etc.

Data Structure

  • Most alternative data vendors offer the data mapped to a ticker or company and in point-in-time (PIT) form.
  • History can vary, but vendors typically offer at least three to five years.
  • Delivery frequency ranges from weekly to monthly or annually, depending on the underlying alternative data source. There is usually a lag of a week or a month in the data.
  • The data is delivered by API, FTP, and S3, and many providers have also developed platforms and UIs that non-quant or less data-adept funds can use to assess trends and KPIs.
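Point-in-time data matters because a backtest should only see values that had actually been published by the date in question. The sketch below shows one possible "as of" lookup over ticker-mapped records; the record layout, ticker, dates, and scores are hypothetical and not taken from any vendor's schema.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical records: (ticker, published_on, score).
RECORDS = [
    ("ABC", date(2023, 1, 31), 55.0),
    ("ABC", date(2023, 4, 30), 61.0),
    ("ABC", date(2023, 7, 31), 58.0),
]

def as_of(records, ticker, when):
    """Return the latest score published on or before `when`, else None."""
    rows = sorted((pub, score) for t, pub, score in records if t == ticker)
    dates = [pub for pub, _ in rows]
    # Number of records with a publication date <= `when`.
    idx = bisect_right(dates, when)
    return rows[idx - 1][1] if idx else None

mid_may = as_of(RECORDS, "ABC", date(2023, 5, 15))
before_history = as_of(RECORDS, "ABC", date(2023, 1, 1))
```

Here `mid_may` picks up the April publication rather than the later July value, and `before_history` is `None` because nothing had been published yet.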

Compliance Considerations

It is important to consider compliance factors such as Material Non-Public Information (MNPI) and Personally Identifiable Information (PII) when evaluating ESG datasets. In addition, other compliance factors may need to be considered depending on the underlying source of the ESG data, which varies from dataset to dataset.

Most ESG datasets are built by scraping corporate filings and websites, social networks, or government websites to collect pertinent data points. Users of the data need to ascertain whether the data vendor collected the data legally and complied with the terms and conditions of the company and/or social network websites involved. In general, web scraping is considered legal as long as the information scraped is publicly available and not restricted behind a paywall.
