Alternative Data Category Description
Technological innovation has allowed alternative data providers to collect data from disparate sources and aggregate it in a format that is useful for asset managers, hedge funds, and private equity firms alike. One such category of alternative data is employment data, which is commonly web-scraped and collected from publicly available sources such as job boards, professional networking sites, employee review platforms, IRS filings, company disclosures, and news and trade publications. Employment data is relatively well established and widely used, yet it continues to see significant interest from buyers.
Alternative employment data in the form of job postings, employee sentiment, headcount, and, to an extent, wages is often used to analyze and evaluate corporate strategy, industry growth, demand for specific skill sets, employee turnover rates, and salesforce growth, to name a few use cases. Job postings in particular can be used to evaluate corporate strategy and direction, industry growth rates, and demand for specific skills. For example, is the demand for candidates with experience in Tableau or Google AdWords growing or plateauing?
Various alternative data providers scrape LinkedIn and other professional networking sites to track changes in corporate employees, allowing analysts to identify companies with high employee turnover rates or companies with strong growth in software engineers or salesforce personnel. Another example of employment alternative data comes in the form of employee reviews through platforms such as Glassdoor and Blind, which can be accessed and parsed for valuable company insights like employee sentiment, employee satisfaction, and expected future attrition. Employment data can also provide insights into a company's diversity, inclusion, and governance efforts through analyzing data on employee/board gender makeup and compensation gap for ESG investing analysis.
The frequency of employment alternative data ranges from daily to monthly, and even quarterly in some cases, depending on the underlying source of the data. The data is often at a ticker and company level, but can also be aggregated at a sector, industry, and region/country level in order to gauge macroeconomic developments and hiring trends. Employment data is usually priced in the range of $40k to $100k depending on the coverage, history, and latency of the data, along with the underlying source. The majority of employment alternative data providers cover the Americas region; however, the number of vendors covering emerging markets has grown in recent years.
Subcategory - Human Capital Data
Human capital data is a broader sub-category within employment data, a space that is now saturated with data providers. It captures employment data such as employee profiles and filings, employee reviews, and sentiment.
Employee profile data requires aggregating millions of employment profiles from public and paid sources, partnerships, and internal research. Metrics include monthly time series of personnel employed by, joining, and leaving a company. Counts are further broken down by gender, tenure, and skills distribution. Use cases for employee profile data include deriving trading signals (high attrition rates, stagnant promotion schedules), comparing workforce allocations between competitors and identifying skill gaps, predicting operating expenses, and monitoring the implementation of a strategy (attracting and retaining employees within a strategic role or skill area). Employee filings data, on the other hand, consists of disclosures submitted by companies in the US through government agencies. All metrics are based on verified and audited filings submitted by the companies themselves to the IRS and the Department of Labor. The filings pertain to employee benefit plans, pension plans, etc., which include retirement (e.g., 401(k)) and welfare (e.g., healthcare) data.
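As a rough illustration of how the monthly joiner and leaver counts described above can be turned into an attrition metric, the sketch below computes attrition as leavers divided by average headcount. The record layout (MonthlyHeadcount and its field names) and the choice of denominator are assumptions made for this example; vendors define these metrics in their own ways.

```python
from dataclasses import dataclass


@dataclass
class MonthlyHeadcount:
    """Hypothetical monthly record for one company (field names are illustrative)."""
    month: str            # e.g. "2024-01"
    start_headcount: int  # profiles employed at the start of the month
    joiners: int          # profiles that joined during the month
    leavers: int          # profiles that left during the month


def end_headcount(row: MonthlyHeadcount) -> int:
    """Headcount at month end implied by the joiner and leaver counts."""
    return row.start_headcount + row.joiners - row.leavers


def attrition_rate(row: MonthlyHeadcount) -> float:
    """Leavers divided by the average of start and end headcount."""
    average = (row.start_headcount + end_headcount(row)) / 2
    return row.leavers / average if average else 0.0


def net_growth_rate(row: MonthlyHeadcount) -> float:
    """Net change in headcount relative to the start of the month."""
    if row.start_headcount == 0:
        return 0.0
    return (row.joiners - row.leavers) / row.start_headcount


if __name__ == "__main__":
    january = MonthlyHeadcount("2024-01", start_headcount=200, joiners=20, leavers=10)
    print(end_headcount(january), attrition_rate(january), net_growth_rate(january))
```

A persistently high attrition rate alongside flat or negative net growth is the kind of pattern the trading-signal use case refers to, although any threshold would need to be calibrated per sector.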
Employee review and sentiment data can be offered as aggregated ratings given to companies across different sectors by former and current employees, as well as the number of reviews at each star level from 1 to 5. Metrics can include review, author title, company pros and cons, recommended value, outlook value, CEO/senior management review, rating, career opportunities, work-life balance, culture, etc. Employee satisfaction offers insight into a company’s Social and Governance factors within the broader ESG landscape. A high employee satisfaction score can lead to increased sales growth, whereas a low employee satisfaction score and a lack of health and safety measures can lead to higher attrition, resulting in loss of business and competitive advantage.
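The aggregation described above (an average rating per company plus the number of reviews at each star level) can be sketched as follows. The input layout, a list of dictionaries with "company" and "rating" keys, is an assumption for illustration and does not reflect any particular vendor's schema.

```python
from collections import Counter
from statistics import mean


def summarize_reviews(reviews):
    """Aggregate individual 1-5 star reviews into per-company summaries.

    `reviews` is assumed to be a list of dicts with "company" and "rating" keys.
    """
    ratings_by_company = {}
    for review in reviews:
        ratings_by_company.setdefault(review["company"], []).append(review["rating"])

    summary = {}
    for company, ratings in ratings_by_company.items():
        counts = Counter(ratings)
        summary[company] = {
            "n_reviews": len(ratings),
            "avg_rating": mean(ratings),
            # Always report all five star levels, even when a level has no reviews.
            "star_counts": {star: counts.get(star, 0) for star in range(1, 6)},
        }
    return summary


if __name__ == "__main__":
    sample = [
        {"company": "ACME", "rating": 5},
        {"company": "ACME", "rating": 3},
        {"company": "Beta", "rating": 1},
    ]
    for company, stats in summarize_reviews(sample).items():
        print(company, stats)
```

In practice the same grouping would usually be done per month as well, so that changes in sentiment can be tracked over time rather than as a single snapshot.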
Subcategory - Job Postings Data
These alternative datasets offer daily job posting activity, tagged with the associated company and, in some cases, a ticker symbol. Different alternative data providers cover different geographical regions. Some scrape directly from corporate websites and others from job boards, and they deploy a methodology to remove duplicates and expired listings to ensure data quality. Some vendors include additional data elements such as company identifiers, industry classification codes, and Standard Occupational Classification (SOC) codes at the individual job level. On average, alternative data vendors can cover 20 different data attributes around the job, such as title, company, job category, location, created date, and the full job description. In terms of applications, analyzing job listings data can reveal insights into company fundamentals.
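Vendors' cleaning methodologies are proprietary, but the general idea of removing duplicates and expired listings can be sketched as below. This is a minimal example under stated assumptions: postings are dictionaries with hypothetical "company", "title", "location", "created", and "expires" keys, and two postings count as duplicates when their normalized company, title, and location match. Real pipelines typically use fuzzier matching.

```python
from datetime import date


def normalize(text):
    """Lower-case and collapse whitespace so trivially different strings match."""
    return " ".join(text.lower().split())


def clean_postings(postings, as_of):
    """Drop expired listings and keep the earliest copy of each duplicate.

    Assumed keys per posting: "company", "title", "location",
    "created" (date) and "expires" (date or None).
    """
    seen = set()
    cleaned = []
    for posting in sorted(postings, key=lambda p: p["created"]):
        expires = posting.get("expires")
        if expires is not None and expires < as_of:
            continue  # listing expired before the as-of date
        key = (
            normalize(posting["company"]),
            normalize(posting["title"]),
            normalize(posting["location"]),
        )
        if key in seen:
            continue  # same job already seen from another source
        seen.add(key)
        cleaned.append(posting)
    return cleaned


if __name__ == "__main__":
    raw = [
        {"company": "Acme Corp", "title": "Data Engineer", "location": "New York",
         "created": date(2024, 1, 5), "expires": None},
        {"company": "ACME corp", "title": "data  engineer", "location": "new york",
         "created": date(2024, 1, 3), "expires": None},
        {"company": "Acme Corp", "title": "Sales Rep", "location": "Austin",
         "created": date(2023, 11, 1), "expires": date(2023, 12, 1)},
    ]
    print(len(clean_postings(raw, as_of=date(2024, 1, 10))))
```

Keeping the earliest copy preserves the original creation date, which matters when the data is later used to count newly created listings per period.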
An increase in newly created job listings and total job listings can mean growth in revenue, while the type of jobs advertised can indicate areas of strategic importance for a company. Hiring rates can indicate changes to production at factories or a slowdown or acceleration in revenue growth within companies. Changes in hiring locations can indicate a company’s expansion or retrenchment plans. Job postings can also signal an increased focus on a particular technology or on functions such as legal and compliance. A reduction in job postings for more traditional roles has been a strong leading indicator of a company’s restructuring plans.
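To make the signals above more concrete, the following sketch counts newly created postings per month, computes the month-over-month change, and reports the share of postings by job category. The posting fields ("created" as a "YYYY-MM-DD" string and "category") are assumptions for illustration, and the calculation is a simple descriptive statistic rather than a tested trading signal.

```python
from collections import Counter


def monthly_new_postings(postings):
    """Count postings per creation month ("YYYY-MM"), in chronological order."""
    counts = Counter(p["created"][:7] for p in postings)
    return dict(sorted(counts.items()))


def month_over_month_change(series):
    """Fractional change between consecutive observed months.

    Months with no postings are absent from the series, so this compares
    consecutive observed months rather than strict calendar neighbours.
    """
    months = list(series)
    return {
        current: (series[current] - series[previous]) / series[previous]
        for previous, current in zip(months, months[1:])
    }


def category_share(postings):
    """Share of postings in each job category."""
    counts = Counter(p["category"] for p in postings)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


if __name__ == "__main__":
    sample = [
        {"created": "2024-01-10", "category": "engineering"},
        {"created": "2024-01-20", "category": "sales"},
        {"created": "2024-02-01", "category": "engineering"},
        {"created": "2024-02-05", "category": "engineering"},
        {"created": "2024-02-09", "category": "legal"},
    ]
    series = monthly_new_postings(sample)
    print(series, month_over_month_change(series), category_share(sample))
```

A rising share for a category such as legal and compliance, or a falling share for traditional roles, is the kind of shift in hiring mix the paragraph above describes.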
Data Structure
- Most data vendors offer the data mapped to ticker or company and on a point-in-time (PIT) basis.
- History can vary, but datasets typically have at least seven years of history.
- Delivery frequency ranges from daily or weekly to monthly, depending on the underlying alternative data source. There is usually a lag of a week or a month in the data.
- The data is delivered via API, FTP, and S3. Some providers have also developed platforms and UIs that non-quant or less data-adept funds can use to assess trends and KPIs.
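The point-in-time mapping and delivery lag listed above can be illustrated with a small lookup that returns only what was known on a given date. The Observation record and its fields are hypothetical and chosen for this example; the idea is simply that each value carries both the period it describes and the date it was published, so a backtest never sees data before it was delivered.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Observation:
    """Hypothetical point-in-time record (field names are illustrative)."""
    ticker: str
    period_end: date    # the date the metric describes
    published: date     # when the vendor delivered it, after any lag
    open_postings: int


def as_of(observations, ticker, when):
    """Return the latest observation for `ticker` published on or before `when`."""
    known = [
        o for o in observations
        if o.ticker == ticker and o.published <= when
    ]
    if not known:
        return None
    # Prefer the most recent period; break ties with the latest publication.
    return max(known, key=lambda o: (o.period_end, o.published))


if __name__ == "__main__":
    history = [
        Observation("ACME", date(2024, 1, 31), date(2024, 2, 7), 120),
        Observation("ACME", date(2024, 2, 29), date(2024, 3, 7), 150),
    ]
    # On 1 March the February figure has not been delivered yet.
    print(as_of(history, "ACME", date(2024, 3, 1)))
```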
Compliance Considerations
Most employment alternative data providers scrape corporate, social network, or government websites in order to collect job postings and human capital data. Users of the data need to ascertain whether the data vendor collected the data legally and complied with all the terms and conditions of a company’s and/or social network’s website. hiQ v. LinkedIn was a ground-breaking court case on web-scraped data, specifically employment data scraped from LinkedIn. The case spent many years in the US court system. It originally looked as though hiQ had won the case and scraping was thus deemed legal. However, scraping data is only legal if conducted on a public-facing website with no login used to access the data. Profiles on LinkedIn that are public can therefore be scraped.
It turned out that, after many years in the court system, hiQ admitted to scraping behind a login and lost the case. The ruling leaves open a broad interpretation of scraping in general, and vendors must adhere to company website T&Cs and robots.txt restrictions and take care over the use of rotating IP addresses. Specifically, for hiQ the ruling said that if a vendor has its own corporate profile on LinkedIn, then it needs to comply with LinkedIn's T&Cs, which do not allow scraping. How the restrictions from the ruling play out beyond hiQ has yet to be seen. The broad hiQ ruling may lead to more cease-and-desist letters and make compliance and data provenance more difficult for any data derived from web scraping. In 2023, Meta also sued Bright Data for scraping data from Facebook and Instagram, alleging that the scraping breached Facebook and Instagram’s terms of service and was thus a breach of contract. The court found that Meta's terms do not prohibit scraping of public data while not logged in. As with everything in the alternative data ecosystem, data provenance is of the utmost importance.
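One small, mechanical part of this diligence is checking whether a given URL is permitted by a site's robots.txt. The sketch below uses Python's standard-library urllib.robotparser on an inline, made-up robots.txt; the user agent name and URLs are placeholders. Passing this check says nothing about a site's T&Cs or login requirements, so it should not be read as establishing legal compliance.

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt content used purely for illustration.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Allow: /jobs/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if `robots_txt` permits `user_agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.com/jobs/123",
        "https://example.com/private/profile",
    ):
        print(url, is_allowed(EXAMPLE_ROBOTS_TXT, "example-bot", url))
```

Logging the outcome of such checks alongside the collection date is one way a vendor could document provenance for later review.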