Alternative Data Category Description
Pricing data is a type of alternative data that contains information on the prices of various products or services. These datasets typically include information such as the vendor, product name, unit price, currency, and date. Pricing data can be sourced from various industries and markets including retail, real estate, autos, consumer packaged goods (CPG), hospitality, and travel. Pricing datasets are often used in market analysis, business decision-making, and academic research.
Aggregated pricing data on goods and services for both businesses and consumers is now more readily available than it was in the previous decade. This type of alternative data can provide insights into corporate revenues and industry competition, and alternative measures of inflation have been developed using web-crawled pricing data.
Pricing data is widely used by investors because it has broad macro and equity applications. It is a useful input into the fundamental analysis of companies, particularly in consumer-related markets. Pricing data in sectors such as CPG, apparel, luxury, and autos can give valuable insight into a company's performance. Beyond pricing alone, web-scraped and other datasets can provide insight at the stock-keeping unit (SKU) level into inventory and discounting.
On a standalone basis, this data has value but may tell only part of the story. Investors may need some indication of sales volume to accompany it. For example, the data could be paired with web traffic or search data to get an understanding of end demand and a broader view of company revenue trends. As such, pricing data is usually combined with another dataset.
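The pairing described above can be sketched in a few lines. This is a minimal illustration, assuming weekly average prices and weekly web visits are already keyed by the same period labels; the function names `revenue_proxy` and `pct_change` and all figures are hypothetical, and treating visits as a stand-in for units sold is itself an assumption that would need validating against reported revenue.

```python
# Hypothetical weekly observations; all numbers are made up for illustration.
avg_price = {"2024-W01": 20.0, "2024-W02": 22.0, "2024-W03": 21.0}
web_visits = {"2024-W01": 1000, "2024-W02": 900, "2024-W03": 1100}


def revenue_proxy(prices, visits):
    """Return price * visits for every period present in both datasets."""
    common = sorted(set(prices) & set(visits))
    return {period: prices[period] * visits[period] for period in common}


def pct_change(series):
    """Period-over-period percentage change of an ordered mapping."""
    periods = list(series)
    return {
        cur: (series[cur] / series[prev] - 1.0) * 100.0
        for prev, cur in zip(periods, periods[1:])
    }


proxy = revenue_proxy(avg_price, web_visits)
growth = pct_change(proxy)
for period, value in growth.items():
    print(f"{period}: {value:+.1f}% change in price x traffic proxy")
```

In this toy data the price rise in the second week is offset by lower traffic, which is the kind of signal that price data alone would miss.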
Subcategory - Automotive Data
These alternative data providers allow asset managers to track changes in vehicle pricing by analyzing data sources such as classifieds, auction results, and dealer websites. By analyzing alternative pricing data, asset managers can gain a more detailed and timely understanding of market pricing dynamics. It is also possible to analyze financing options and promotional offers, estimate average vehicle prices by make and model, count new versus used cars, and track purchases across geographical regions.
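As a rough sketch of the make-and-model averages and new-versus-used counts mentioned above, the following assumes listings arrive as simple records. The field names (`make`, `model`, `condition`, `price`) and the sample rows are hypothetical and do not reflect any particular vendor's schema.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical listing records scraped from classifieds or dealer sites.
listings = [
    {"make": "Toyota", "model": "Corolla", "condition": "used", "price": 15000},
    {"make": "Toyota", "model": "Corolla", "condition": "new", "price": 23000},
    {"make": "Ford", "model": "F-150", "condition": "used", "price": 31000},
    {"make": "Ford", "model": "F-150", "condition": "used", "price": 29000},
]


def average_price_by_model(rows):
    """Average listing price for each (make, model) pair."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[(row["make"], row["model"])].append(row["price"])
    return {key: mean(prices) for key, prices in grouped.items()}


def count_by_condition(rows):
    """Number of listings per condition, e.g. new versus used."""
    return Counter(row["condition"] for row in rows)


avg_prices = average_price_by_model(listings)
conditions = count_by_condition(listings)
print(avg_prices)
print(conditions)
```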
Subcategory - CPG Data
Pricing data for the CPG industry typically includes information on the prices of products sold in retail stores and online. Point-of-sale data collected from retailers can provide insights into product prices and discounts.
Subcategory - Ecommerce Data
These datasets contain information related to the prices of products and services offered by online retailers, for example the prices of individual products, the prices of similar products from competitors, and the prices of products in specific categories or industries. Different data vendors cover different geographical regions, and granularity ranges from individual SKUs to roll-ups by category, brand, or top-line retailer. Markdown and discounting data is also generally available and is important. Likewise, a measure of inventory can be observed through product availability and stockouts.
E-commerce pricing data can be collected through web scraping, API integration, and manual data entry. However, it's important to note that this data can be complex and dynamic, with prices changing frequently based on factors such as demand, competition, and supply chain costs. As a result, asset managers need to monitor and analyze this data continuously to stay up to date with market trends.
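The roll-up from SKU level to category level, together with the markdown and stockout measures described above, might look something like the sketch below. The record layout (`list_price`, `sale_price`, `in_stock`) and the helper `category_summary` are assumptions made for illustration; real vendor feeds will differ.

```python
from collections import defaultdict

# Hypothetical SKU-level observations for a single crawl date.
observations = [
    {"sku": "A1", "category": "shoes", "list_price": 100.0, "sale_price": 80.0, "in_stock": True},
    {"sku": "A2", "category": "shoes", "list_price": 50.0, "sale_price": 50.0, "in_stock": False},
    {"sku": "B1", "category": "bags", "list_price": 200.0, "sale_price": 150.0, "in_stock": True},
]


def category_summary(rows):
    """Roll SKU-level rows up to category-level discount and stockout metrics."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["category"]].append(row)

    summary = {}
    for category, items in grouped.items():
        # Discount depth per SKU as a fraction of the list price.
        discounts = [1.0 - r["sale_price"] / r["list_price"] for r in items]
        summary[category] = {
            "sku_count": len(items),
            "avg_discount": sum(discounts) / len(items),
            "share_discounted": sum(d > 0 for d in discounts) / len(items),
            "stockout_rate": sum(not r["in_stock"] for r in items) / len(items),
        }
    return summary


summary = category_summary(observations)
for category, stats in summary.items():
    print(category, stats)
```

Because prices change frequently, a summary like this would normally be recomputed per crawl date and tracked as a time series rather than read as a single snapshot.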
Subcategory - Hospitality & Leisure Data
These types of alternative datasets provide information on room rates, occupancy rates, booking segments, and channels. Asset managers use this data to analyze average daily rates, revenue per available room, and other pricing-related metrics for the hospitality industry.
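For reference, average daily rate (ADR) is room revenue divided by rooms sold, and revenue per available room (RevPAR) is room revenue divided by rooms available, which equals ADR multiplied by occupancy. The sketch below applies those standard definitions to made-up inputs; the function name `hotel_metrics` is hypothetical.

```python
def hotel_metrics(room_revenue, rooms_sold, rooms_available):
    """Compute ADR, occupancy, and RevPAR for one property and period."""
    if rooms_sold <= 0 or rooms_available <= 0:
        raise ValueError("rooms_sold and rooms_available must be positive")
    adr = room_revenue / rooms_sold            # average daily rate
    occupancy = rooms_sold / rooms_available   # share of rooms occupied
    revpar = room_revenue / rooms_available    # equals adr * occupancy
    return {"adr": adr, "occupancy": occupancy, "revpar": revpar}


# Illustrative inputs: 12,000 in room revenue, 80 of 100 rooms sold.
metrics = hotel_metrics(room_revenue=12000.0, rooms_sold=80, rooms_available=100)
print(metrics)  # {'adr': 150.0, 'occupancy': 0.8, 'revpar': 120.0}
```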
For travel, data vendors typically provide information on ticket prices, seat availability, and fees for various services. Aggregators also compile pricing data from multiple sources and predict future prices. Asset managers use this data to trade travel-related securities.
Subcategory - Real Estate Data
These datasets provide information on the inventory of commercial properties, buildings available for lease or sale, existing tenant information, historical trends in demographics, and occupancy and lease rates. There is also a growing amount of data available on residential markets, covering transactions, values, inventory, developments, and rents. Real estate pricing trends in emerging-market countries can help analysts understand the breadth of economic growth as well as inflationary pressures.
Data Structure
- Pricing data can be delivered in several different formats and even directly to your preferred cloud environment. Some providers deliver it as CSV via SFTP and offer a dashboard view through their platform. If you need the data faster, some providers offer APIs.
- Data vendors typically track goods at the SKU level, and some of the larger vendors provide data at the ticker level.
- History varies but is usually 5-10 years of data, with some real estate data providers having older data.
- Delivery is typically daily on a T+1 or T+2 basis, but is sometimes rolled up to weekly or monthly.
- Data is delivered by API, FTP, S3, and Platform/UI.
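Where a daily feed needs to be compared with data that arrives only weekly, a roll-up along the lines below may be useful. This is a sketch only: it assumes daily observations arrive as `(date, price)` pairs and uses ISO calendar weeks with a simple mean, whereas a vendor may define weeks or weighting differently. The name `weekly_average` and the sample values are hypothetical.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical daily price observations for a single SKU.
daily_prices = [
    (date(2024, 1, 1), 10.0),
    (date(2024, 1, 2), 12.0),
    (date(2024, 1, 8), 11.0),
    (date(2024, 1, 9), 13.0),
]


def weekly_average(rows):
    """Average daily prices into (ISO year, ISO week) buckets."""
    buckets = defaultdict(list)
    for day, price in rows:
        iso = day.isocalendar()  # (ISO year, ISO week number, ISO weekday)
        buckets[(iso[0], iso[1])].append(price)
    return {week: mean(prices) for week, prices in sorted(buckets.items())}


weekly = weekly_average(daily_prices)
print(weekly)
```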
Compliance Considerations
Pricing data may be scraped from sources that are open to the public. Because the information is considered part of the public domain and no password is needed to view it, it can be argued that web crawling falls within the realm of permissible activities. However, a user of the data needs to ensure that the data vendor obtained the data legally, complied with all website terms and conditions (T&Cs), and, most importantly, did not scrape anything behind a login or paywall. Any data obtained from third-party alternative data sources needs strong diligence on data provenance.
Pricing data obtained from consumer transaction data must comply with consumer privacy requirements covering personally identifiable information (PII). Particularly stringent controls are needed in Europe under the General Data Protection Regulation (GDPR). POS data does not raise these concerns because it is gathered at a company or store level and contains no PII. Email receipt and merchant-level transaction data vendors make a significant effort to anonymize the data, using hashing and other techniques followed by a level of aggregation that also helps remove PII.
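The hash-then-aggregate approach described above can be illustrated with a short sketch. Everything here is an assumption for illustration: the keyed SHA-256 hash, the `SECRET_KEY` placeholder, the `min_users` suppression threshold, and the record fields are hypothetical and not a description of any vendor's actual process. Hashing on its own is generally regarded as pseudonymization rather than full anonymization, which is why the aggregation step matters.

```python
import hashlib
import hmac
from collections import defaultdict

# Hypothetical key; in practice it would be stored securely, not in source code.
SECRET_KEY = b"replace-with-a-secret-key"


def pseudonymize(user_id, key=SECRET_KEY):
    """Replace an identifier with a keyed SHA-256 digest (HMAC)."""
    return hmac.new(key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


def aggregate_by_merchant(transactions, min_users=2):
    """Total spend per merchant, dropping merchants with too few distinct users."""
    spend = defaultdict(float)
    users = defaultdict(set)
    for tx in transactions:
        token = pseudonymize(tx["email"])  # the raw email is never stored
        spend[tx["merchant"]] += tx["amount"]
        users[tx["merchant"]].add(token)
    return {
        merchant: {"total_spend": spend[merchant], "unique_users": len(users[merchant])}
        for merchant in spend
        if len(users[merchant]) >= min_users
    }


# Made-up transactions for illustration.
transactions = [
    {"email": "a@example.com", "merchant": "ShopA", "amount": 10.0},
    {"email": "b@example.com", "merchant": "ShopA", "amount": 15.0},
    {"email": "a@example.com", "merchant": "ShopB", "amount": 7.5},
]

result = aggregate_by_merchant(transactions)
print(result)  # ShopB is suppressed because only one distinct user appears
```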