Alternative Data Category Description
Consumer transaction data is one of the alternative data types most widely deployed by investment managers. It is also used extensively by private equity firms and corporates with exposure to the retail segment of the economy. Quants and discretionary managers typically use consumer transaction data to predict quarterly revenue growth and earnings. Because the data is available before quarterly earnings are released, it is well suited to gauging whether a company is going to beat or miss Wall Street estimates.
However, consumer transaction data can also be used by long-term investors to gain insights into consumer purchasing behavior and acceptance of new products, and to evaluate the implications of promotions and discounting. There are four primary sources of consumer transaction data: merchant-level data (credit & debit cards), email receipt data, point-of-sale data, and, more recently, eCommerce data or Fintech apps that have basket-level access to online sales. As the data is almost real-time, it can give a very strong read on how a consumer-facing company is performing. The data can be aggregated up to a sector level, or it can be granular at a merchant or company level, and sometimes at a product level. Given the granularity offered by consumer transaction data, it can be costly to acquire. Some credit & debit card data vendors have sold their data for $1m - $2m annually; most datasets sell at around the $500k mark.
Few funds can afford such price points. In response to this, some alternative data providers have started to sell their alternative data at a ticker level or for a basket of tickers, rather than requiring clients to purchase the whole dataset. Ticker bundles are typically in a range of $30-50k. This has given many more funds access to the data to help answer research questions and to use it for a select number of investments in a portfolio.
Subcategory - Credit & Debit Card Data
This subcategory of alternative data consists of credit and debit card data and can also include data from Fintech apps, gift cards, and rewards cards. The data is called merchant data because the transactions are rolled up to the company level, for example Costco or McDonald's. There is no SKU-level data on the products sold and no detail such as discounting. The data can be used to model top-line revenue growth and estimate whether a company is going to beat or miss consensus for the quarter. Some datasets offer a deeper analysis of cross-shoppers, relative market share, churn, and customer loyalty for companies within a consumer segment of the market. The price of this type of data can range from $500k to well over $1m, depending on how much data you require and how granular it needs to be.
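As a rough illustration of the beat-or-miss use case, the sketch below scales last year's reported revenue by the year-over-year growth seen in a card panel and compares the result with consensus. The function names, the 1% tolerance, and the assumption that panel growth tracks company-wide growth are all hypothetical simplifications, not a vendor's or a fund's actual method.

```python
def estimate_revenue(panel_spend_now, panel_spend_year_ago, reported_revenue_year_ago):
    """Scale last year's reported revenue by the panel's year-over-year growth.

    Assumes (hypothetically) that growth observed in the card panel is
    representative of growth across all of the company's customers.
    """
    yoy_growth = panel_spend_now / panel_spend_year_ago - 1.0
    return reported_revenue_year_ago * (1.0 + yoy_growth)


def beat_or_miss(estimate, consensus, tolerance=0.01):
    """Classify the estimate against consensus.

    `tolerance` is an illustrative band: estimates within +/-1% of
    consensus are treated as "in line".
    """
    surprise = estimate / consensus - 1.0
    if surprise > tolerance:
        return "beat"
    if surprise < -tolerance:
        return "miss"
    return "in line"
```

For example, if panel spend rose from 100 to 110 and the company reported revenue of 1,000 in the same quarter a year ago, the estimate is roughly 1,100; against a consensus of 1,050, `beat_or_miss` returns `"beat"`. In practice the panel would need adjusting for sample bias and panel churn before a comparison like this is meaningful.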
Subcategory - Email Receipt Data
This subcategory consists of data taken from an opt-in panel of consumers’ email accounts. The vendor supplies a service or an email app to the consumer in return for access to the data. As such, it should be compliant with consumer permissions and rights to use the data. The data can be at the SKU or product level. It can be very granular, and where the panel size is robust enough it can be very insightful for monitoring product growth trends, market share, and the traction of new product launches. The data can be complementary to merchant-level transactions. It should be noted that Amazon stopped sending itemized e-receipts in the US and Europe; only a total purchase amount is now given. Well-known data vendors aggregate the data and provide PII (Personally Identifiable Information) compliant insights at the product, brand, company, and ticker level. A bulk data feed costs more than $250k, while ticker bundles could cost $40-50k. Email receipt data is harder to source in Europe, but some alternative data vendors now offer it.
Subcategory - Point of Sale Data
As the name implies, this data monitors the sale of products at source. When products are delivered to a store location, they are scanned by digital devices that in turn store the data in a database. POS (Point of Sale) data is prevalent across CPG (consumer packaged goods) and FMCG (fast-moving consumer goods), where goods are measured digitally down to the product and individual SKU, for example 1-litre or 300ml cases of water or soda. There are POS vendors in countries across Asia that can give valuable insight into CPG/FMCG at a reasonable cost. Datasets typically cost in the range of $50-100k.
Data Structure
- Most vendors offer data that is mapped to tickers and is point-in-time (PIT).
- History varies. Datasets typically have at least seven years of history.
- Delivery frequency can be daily or weekly. Merchant-level data is normally delivered daily with a T+4 to T+6 day lag. Email receipt data is delivered daily with a T+1 day lag. Some POS vendors deliver monthly.
- Some of the larger merchant data providers have platforms and UIs that non-quant or less data-adept funds can use to assess trends and KPIs. POS and email receipt data vendors usually do not have platforms or UIs but do offer curated views of the data that are easier to ingest.
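The delivery lags above matter when backtesting, because a transaction only becomes usable once the vendor has delivered it. A minimal sketch of a point-in-time filter follows. The record layout, the `LAG_DAYS` table, and the choice to treat lags as calendar days at the upper end of the quoted ranges are assumptions made for illustration.

```python
from datetime import date, timedelta

# Illustrative lags in calendar days, taken from the upper end of the
# ranges quoted above (assumption: vendors may count business days instead).
LAG_DAYS = {"merchant": 6, "email_receipt": 1}


def available_on(transaction_date, source):
    """Return the first date on which a transaction would be delivered."""
    return transaction_date + timedelta(days=LAG_DAYS[source])


def visible_as_of(records, as_of):
    """Keep only the records that had already been delivered by `as_of`."""
    return [r for r in records if available_on(r["date"], r["source"]) <= as_of]


records = [
    {"source": "merchant", "date": date(2024, 3, 1), "amount": 42.0},
    {"source": "email_receipt", "date": date(2024, 3, 5), "amount": 17.5},
    {"source": "merchant", "date": date(2024, 3, 4), "amount": 30.0},
]
usable = visible_as_of(records, as_of=date(2024, 3, 7))
```

Here the merchant transaction dated 4 March is excluded as of 7 March because, under the assumed six-day lag, it would not arrive until 10 March. Filtering this way helps avoid look-ahead bias in a backtest.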
Compliance Considerations
Compliance considerations are most important for merchant-level data and email receipt data. The major concern for these two subcategories is Personally Identifiable Information (PII): how vendors scrub the data of PII and anonymize this type of alternative data so that nothing can be tied back to a person. Controls need to be particularly stringent in Europe, where GDPR (General Data Protection Regulation) applies. Point of Sale (POS) alternative data does not raise these concerns, as the data is gathered at a company or store level and contains no PII. Email receipt and merchant-level transaction data vendors make a significant effort to anonymize the data using hashing and other techniques, and then perform a level of aggregation that further helps remove PII.
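The two steps mentioned above, hashing identifiers and then aggregating, can be sketched as follows. This is only an illustration under stated assumptions: the keyed SHA-256 hash, the field names, and the `min_panelists` suppression threshold are hypothetical choices, not a description of any vendor's pipeline.

```python
import hashlib
import hmac
from collections import defaultdict


def pseudonymize(card_id, secret_key):
    """Replace a raw identifier with a keyed SHA-256 digest (hex string)."""
    return hmac.new(secret_key, card_id.encode("utf-8"), hashlib.sha256).hexdigest()


def aggregate_by_merchant(transactions, secret_key, min_panelists=2):
    """Roll transactions up to merchant level and drop thin groups.

    `min_panelists` is an illustrative threshold: merchants seen for fewer
    distinct panelists are suppressed so that no row reflects one person.
    """
    spend = defaultdict(float)
    panelists = defaultdict(set)
    for t in transactions:
        token = pseudonymize(t["card_id"], secret_key)
        spend[t["merchant"]] += t["amount"]
        panelists[t["merchant"]].add(token)
    return {
        merchant: {"spend": spend[merchant], "panelists": len(panelists[merchant])}
        for merchant in spend
        if len(panelists[merchant]) >= min_panelists
    }


transactions = [
    {"card_id": "card-001", "merchant": "Costco", "amount": 100.0},
    {"card_id": "card-002", "merchant": "Costco", "amount": 50.5},
    {"card_id": "card-001", "merchant": "McDonald's", "amount": 9.0},
]
summary = aggregate_by_merchant(transactions, secret_key=b"example-key")
```

With this sample input only Costco survives, since McDonald's was seen for a single panelist. The output carries merchant-level totals and panelist counts but no per-person tokens.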
Another concern for email receipt data is ensuring that the data collected by the alternative data provider is fully compliant with consumer opt-in assurances. The panel should be compliant with all opt-in terms and conditions. Card data usually covers around 3-5% of merchant-level spending, and it can be the most expensive data to acquire. It has significant, or material, value as a signal; otherwise, why pay so much for it? As such, it could be considered material from an MNPI (Material Non-Public Information) perspective. However, the data is public in the sense that it is available to purchase by anyone, or any fund, that can afford to pay for it. It therefore does not fall under MNPI rules. For all three types of consumer transaction data, and for all alternative data, data provenance is extremely important. If the data comes from third-party alternative data providers, then compliance checks on those providers are very important. It must be ascertained that the alternative data provider's own suppliers are not sourcing the data from a bad actor or from a non-public source.