Alternative Data Category Description
B2B data is alternative data that data aggregators gather about companies and about how companies interact and transact with each other. Aggregators collect data from various sources and combine it into a form that companies can use for competitive analysis, for devising marketing and go-to-market campaigns, and for generating prospects and sales leads. A derivative of the data can also be useful for asset managers. Good B2B data is highly desirable to asset managers, but data with strong investment use cases is difficult to come by.
Gathering relevant data can be complex and time-consuming because the data is pulled from various sources. Typically, it is gathered by web scraping company websites and employment sites such as LinkedIn, by obtaining access to email accounts, and by using publicly available data at state and local levels, such as tax filings. Aggregating, deduplicating, cleansing, and validating all this data takes time and resources. B2B vendors perform this exercise and sell the data to companies, but insights can also be drawn from the data for investment purposes.
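The deduplication step described above can be sketched as follows. This is a minimal illustration, not any vendor's actual method: the record fields, the list of legal suffixes, and the rule of matching on a normalized company name are all assumptions made for the example.

```python
import re

# Hypothetical legal suffixes to ignore when matching names; extend as needed.
LEGAL_SUFFIXES = {"inc", "corp", "corporation", "llc", "ltd", "co", "company"}

def normalize_name(name: str) -> str:
    """Lowercase, replace punctuation with spaces, and drop trailing legal suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    while tokens and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)

def dedupe(records: list[dict]) -> list[dict]:
    """Merge records that share a normalized name.

    For each field, the first non-empty value seen is kept.
    """
    merged: dict[str, dict] = {}
    for rec in records:
        target = merged.setdefault(normalize_name(rec["name"]), {})
        for field, value in rec.items():
            if value and not target.get(field):
                target[field] = value
    return list(merged.values())

# Made-up records standing in for the same firm seen in two sources.
records = [
    {"name": "Acme Corp.", "website": "acme.example", "employees": None},
    {"name": "ACME Corporation", "website": "", "employees": 1200},
    {"name": "Globex, Inc.", "website": "globex.example", "employees": 300},
]
clean = dedupe(records)
```

Here the two Acme records collapse into one that combines the website from the first source with the employee count from the second. Real pipelines usually need fuzzier matching and a validation pass, which this sketch omits.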
Subcategory - Firmographic Data
Data about firms, also known as firmographics, is information that quantifies as well as defines the features of businesses (i.e., companies). These factors can include the size of the company, the industry it operates in, its legal standing, tax ID, and revenue estimates. The dataset can also contain PII in the form of emails and phone numbers and information on senior management employment history.
Most data vendors also have details on total employee count and growth in headcount over time. Firmographics is a similar concept to demographics but rather than concentrating on describing and classifying people, firmographics describes and classifies businesses.
Firmographic data can help investors learn more about a business or sector they are researching. This type of data can be particularly valuable to private equity funds for deal sourcing and due diligence. From an asset management perspective, the data can be used to evaluate an industry and to carry out competitive analysis of the public companies in that industry.
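As an illustration of the headcount growth figures mentioned above, period-over-period growth can be computed from dated employee-count snapshots. The snapshot values below are made up, and the list-of-tuples layout is an assumption rather than any vendor's schema.

```python
def headcount_growth(snapshots: list[tuple[str, int]]) -> list[tuple[str, float]]:
    """Growth rate of each snapshot relative to the previous one.

    Assumes snapshots are in chronological order and that no
    previous headcount is zero.
    """
    growth = []
    for (_, prev), (period, curr) in zip(snapshots, snapshots[1:]):
        growth.append((period, (curr - prev) / prev))
    return growth

# Hypothetical yearly employee counts for one company.
snapshots = [("2021", 100), ("2022", 120), ("2023", 150)]
growth = headcount_growth(snapshots)
```

For these sample numbers the result is 20% growth into 2022 and 25% into 2023.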
Subcategory - Supply Chain Data
A holistic view of B2B supply chain data captures and tracks business interactions and transactions between companies: how companies are connected from supplier to customer, and where they sit in a competitive matrix. Most vendors approach this topic using company-disclosed data. The most robust supply chain datasets cover public companies, which disclose their customers and suppliers in various filings and press releases. The supply chain vendor then assembles all the linkages into a data offering. FactSet and Inferess fall into this category of data vendor. A more novel way of making supply chain connections is used by Predik Data, which uses geolocation data to build out its dataset. Predik monitors geolocation or footfall traffic between companies and infers a connection between them. For obvious reasons, this works best for the delivery of physical goods rather than services.
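The supplier-to-customer linkages described above map naturally onto a directed graph. The sketch below is only an assumption about how such a dataset might be represented; the company names and links are invented and do not reflect any vendor's data or schema.

```python
from collections import defaultdict

# Hypothetical (supplier, customer) pairs, e.g. extracted from filings.
links = [
    ("ChipCo", "PhoneCo"),
    ("GlassCo", "PhoneCo"),
    ("ChipCo", "LaptopCo"),
    ("PhoneCo", "RetailCo"),
]

customers_of: dict[str, set[str]] = defaultdict(set)
suppliers_of: dict[str, set[str]] = defaultdict(set)
for supplier, customer in links:
    customers_of[supplier].add(customer)
    suppliers_of[customer].add(supplier)

def downstream(company: str) -> set[str]:
    """All direct and indirect customers reachable from `company`."""
    seen: set[str] = set()
    stack = [company]
    while stack:
        # .get avoids creating empty entries for companies with no customers.
        for nxt in customers_of.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen
```

With these sample links, `downstream("ChipCo")` reaches PhoneCo and LaptopCo directly and RetailCo indirectly, which is the kind of exposure question an analyst might ask of a supply chain dataset.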
Subcategory - Technographic Data
Technographic data is information about the hardware and software tools used in company processes. Just as "demographics" and "firmographics" are used to classify, analyze, and target consumers and businesses, "technographics" does the same for businesses according to the technology and software they use in a given market. This method is commonly known as technological segmentation.
Vendors usually scrape corporate websites for indications that certain technologies, databases, and software are in use. This is enhanced by looking at job postings and social media postings. The employment data is used to infer which technologies a particular company uses and how widely it has adopted them. This detail can show the use and adoption of products from established public technology companies as well as from private companies that are new to the market.
The public companies are usually mapped to tickers in the dataset. In short, the term "technographic data" refers to an estimate of the technology stack that a company is using at a given point in time. Over time, this can indicate trends for particular products within that stack.
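A product-level trend of the kind described above can be approximated by counting how many distinct companies are observed using each product in each period. The observations, tickers, and product names below are hypothetical, and the three-field layout is an assumption for illustration only.

```python
from collections import defaultdict

# Hypothetical sightings: (period, company ticker, product detected).
observations = [
    ("2023-Q1", "AAA", "CloudDB"),
    ("2023-Q1", "BBB", "CloudDB"),
    ("2023-Q1", "AAA", "LegacyERP"),
    ("2023-Q2", "AAA", "CloudDB"),
    ("2023-Q2", "BBB", "CloudDB"),
    ("2023-Q2", "CCC", "CloudDB"),
    ("2023-Q2", "CCC", "CloudDB"),  # repeat sighting, e.g. a second job posting
]

def adoption_by_period(rows) -> dict[tuple[str, str], int]:
    """Count distinct companies per (product, period)."""
    users: dict[tuple[str, str], set[str]] = defaultdict(set)
    for period, company, product in rows:
        users[(product, period)].add(company)
    return {key: len(companies) for key, companies in users.items()}

adoption = adoption_by_period(observations)
```

Using sets means repeated sightings of the same company do not inflate the count. In this sample, CloudDB goes from two observed users to three, while LegacyERP has no sightings in the second quarter.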
Data Structure
- Data is usually mapped to the company, but it may not be point-in-time (PIT).
- Depending on the data source, delivery frequency ranges from daily to monthly. Weekly or monthly delivery is most common.
- History varies; it typically dates back at least five years, and up to 10 years with some vendors.
- API is the most common delivery method, typically used to pull the quantity of data or the number of companies required. FTP and S3 are also common, and some vendors offer platforms/UI.
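To illustrate why point-in-time data matters, the sketch below stores both the period a value describes and the date it was first published, then answers "what was known as of a given date". The field layout and sample values are assumptions for the example, not a vendor format.

```python
from datetime import date

# Hypothetical rows: (company, employee_count, period_end, first_published).
history = [
    ("AAA", 1000, date(2023, 3, 31), date(2023, 4, 15)),
    ("AAA", 1050, date(2023, 3, 31), date(2023, 7, 1)),   # later revision of the same period
    ("AAA", 1100, date(2023, 6, 30), date(2023, 7, 15)),
]

def as_of(rows, company: str, when: date):
    """Return the latest record for `company` that had been published by `when`.

    Prefers the most recent period, then the most recent publication
    within that period. Returns None if nothing was known yet.
    """
    known = [r for r in rows if r[0] == company and r[3] <= when]
    return max(known, key=lambda r: (r[2], r[3]), default=None)
```

A dataset that keeps only the revised value (1050) would show a backtest run as of May 2023 a number that was not available until July, which is the look-ahead problem that PIT data avoids.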
Compliance Considerations
Firmographic data can contain a lot of Personally Identifiable Information (PII), so rules around PII, including GDPR, need to be observed. From an asset management perspective, any PII should be removed from the rest of the firmographic data. Private equity (PE) rules may be less restrictive on this point, and many PE firms use senior management information on roles and previous positions as a signal in deal sourcing. The majority of B2B data is scraped from sources that are open to the public, including business websites, registers, tax returns, and employment websites.
Because the information is considered part of the public domain and no password is needed to view it, it can be argued that web crawling falls within the realm of permissible activities. However, a user of the data needs to ensure that the vendor obtained the data legally, complied with all website T&Cs, and, most importantly, did not scrape anything behind a login or paywall. B2B data could in theory provide material non-public information (MNPI), but this is unlikely. Because the alternative data is public, the main requirements are establishing data provenance and making sure the data was acquired legally.
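The PII removal mentioned above could look something like the following before a firmographic record is passed to an investment team. This is a minimal sketch under stated assumptions: the field names are hypothetical, the email pattern is deliberately simple, and a real compliance process would need far broader coverage than this.

```python
import re

# Hypothetical names of fields that hold PII in a vendor record.
PII_FIELDS = {"email", "phone", "contact_name"}

# Simple pattern for email addresses left in free-text fields.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(\.[\w-]+)+")

def strip_pii(record: dict) -> dict:
    """Drop known PII fields and redact email addresses in remaining text values."""
    cleaned = {}
    for field, value in record.items():
        if field in PII_FIELDS:
            continue
        if isinstance(value, str):
            value = EMAIL_RE.sub("[redacted]", value)
        cleaned[field] = value
    return cleaned

# Made-up record for illustration.
record = {
    "company": "Acme",
    "email": "jane@acme.example",
    "phone": "555-0100",
    "notes": "Contact jane.doe@acme.example for details",
    "employees": 1200,
}
safe = strip_pii(record)
```

The company-level fields survive while the contact fields are dropped and the address embedded in the notes is redacted.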