Product Catalog Analysis

Updated: Apr 7, 2021

Part of Datacop's Blog Series on Data Science in the Digital Economy (#1)

The Product Catalog Analysis is a report that e-commerce analysts use to analyse the performance of the products on their e-shops. In this blog, we will discuss how they evaluate that performance. Find out how to identify which products are costing you money. Learn about a systematic method to discover "organically trending" products among the hundreds or thousands of products on your site that could be your future cash drivers. This blog will also detail the data requirements and show how to interpret your product catalog, using an example dataset from a UK-based fashion e-shop.

Level 1: Product Catalog Analysis

Data Requirements: Easy
purchase_product | attributes: name_ID, quantity, price, cost

At its simplest, analysing the performance of your products means sorting them all in a table by their total number of transactions (Figure 1). The dataset used in this article is the top 20 items sold by a male-oriented fashion shoe e-shop, with the original names anonymised. Once the products are ordered, you can clearly see the “winner” products: A, B and C. This is helpful even for smaller e-shops, to understand which products are selling and which to prioritise in future marketing, merchandising and inventory decisions. Here you can find an example of a Level 1 Report in Shopify.



Figure 1: Level 1, Top 20 products sold in a weekly period for a UK fashion brand.
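A Level 1 report of this kind can be sketched in a few lines of Python. This is a minimal sketch, not the report behind Figure 1: the events below are made up for illustration, and the function name top_products is hypothetical. It assumes each purchase_product event counts as one order.

```python
from collections import Counter

# Hypothetical purchase_product events (illustrative values, not the article's data).
purchases = [
    {"name_ID": "Product A", "quantity": 1},
    {"name_ID": "Product B", "quantity": 2},
    {"name_ID": "Product A", "quantity": 1},
    {"name_ID": "Product C", "quantity": 1},
    {"name_ID": "Product A", "quantity": 3},
    {"name_ID": "Product B", "quantity": 1},
]


def top_products(events, n=20):
    """Count one order per purchase event and return the n best sellers."""
    orders = Counter(event["name_ID"] for event in events)
    # most_common sorts by order count, highest first.
    return orders.most_common(n)


level_1_report = top_products(purchases)
for name, order_count in level_1_report:
    print(f"{name}: {order_count} orders")
```

Summing the quantity attribute instead of counting events would rank products by units sold rather than by orders.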


However, this level leaves some questions unanswered. What if the top-selling products also had the benefit of being viewed far more often by customers? Perhaps the product was featured in your social media channels? How does the exposure of a product to your customers affect its sales? When we compare Products K and L, with 34 and 33 orders respectively, is their performance really comparable, or did one of them achieve its orders with much more or much less exposure?

Level 2: Product Catalog Analysis

Data Requirements: Medium
purchase_product | attributes: name_ID, quantity, price, cost, timestamp
product_detail_view | attributes: name_ID, # of views per product, timestamp

To take the Catalog Analysis a level higher and overcome some of the limitations of Level 1, we must combine the orders dataset with each product’s number of views on the site within the same time period. view_items is a useful metric to combine with transactions at the product level for two reasons: it is a widely recognised proxy for shopper interest in a product, and view_items are relatively rare within each session.


Now consider the original analysis combined with the number of views in the same time period as the orders (Figure 2). This view recontextualises our original analysis. For example, the original “winner” was Product A, while now we see that Product F would make a more appropriate pick for a “winner”: it has the best conversion rate. We can see that Product A, while being the top-selling product, also got the highest number of views and has a conversion rate lower than the average (approx. 3.5%).

Figure 2: Level 2, Top 20 products sold in a weekly period for a UK fashion brand, combined with their number of product detail views in that same week. Ordered by # of orders.
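The conversion rate used at this level is simply orders divided by product detail views. As a minimal sketch: the function name conversion_rate is hypothetical, and Product A's 109 orders are inferred from the figures quoted in this article (294 projected orders minus the 185-order difference), not read from the figure itself.

```python
def conversion_rate(orders: int, views: int) -> float:
    """Share of product detail views that ended in an order."""
    if views <= 0:
        # Assumption: a product with no views gets a rate of zero.
        return 0.0
    return orders / views


# Product A: 3867 views in the example week, 109 orders (inferred, see above).
cr_a = conversion_rate(109, 3867)
print(f"Product A conversion rate: {cr_a:.2%}")
```

This puts Product A below the roughly 3.5% average despite it being the top seller by orders.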


This insight reveals a suboptimal investment of exposure across the products, and shows that better management of exposure can yield a revenue boost. Consider Product F, with roughly three times the conversion rate of Product A: had it received the kind of exposure Product A had, the e-shop would have sold far more products within the time period. If Product F had 3867 view_items at its 7.61% conversion rate, it would have sold 294 products, which is 185 more than Product A sold at the same level of exposure.

When we have implemented such “product placement switching”, we have observed two effects:

Effect 1. Product A will have reduced exposure on site, and its number of orders in the same time frame will fall.
Effect 2. Product F will have increased exposure on site, and its number of orders in the same time frame will increase.

So as long as Product F’s conversion rate does not drop below the average of all the products, the decision is a good one and will yield a boost in revenue.

Products B, C and D, coloured in yellow, form the “successful baseline”. Level 2 of the product catalog analysis reveals not only suboptimal placements of products, but also products that have above-average exposure (the average is approx. 1350 view_items) and an above-average conversion rate. These products already have high exposure and are selling well. Since they are bringing in a lot of revenue, their exposure level does not need to be changed. An e-commerce shop that maximises its potential by applying this analysis will typically have more yellows than reds within its top 20 and top 70 products by # of view_items.

Products E, G and O, coloured in red, are the products that are "costing us money". These products have had above-average exposure and below-average conversion rates. They require additional exploration. Were they pushed by a particular channel that may have caused a dip in conversion (mentioned on social media, included in a -20% sale email, etc.)? These products should be seriously considered for reduced exposure. If a product of this type is featured on the homepage, should it be replaced?

Products F, H and M, coloured in green, are “organically trending”. These products have had below-average exposure and the top 3 highest conversion rates. They represent the products organically selected by your customers as most effective at converting. Typically, within the studied time frame, they were not pushed in any particular way to increase their exposure. This is why they are considered organically trending. These products merit serious consideration as first picks for future merchandising activities. Double-check whether they were promoted in some way; if they were, the high conversion rate may instead be a signal of a well-executed merchandising push.

Most e-commerce firms can perform Level 2 of the catalog analysis with the help of more advanced, but still common, software on the market, such as Google Analytics’s enhanced ecommerce reporting and BigCommerce’s proprietary reports.

There are two main limitations to consider when interpreting your results at this level of the analysis:

1. The first is the price variability of products. What if the products with a high conversion rate have a significantly lower price? We know that prices affect customers' decision to purchase: a high-priced laptop will “naturally” have a lower CR% than a low-priced keyboard. So it is important not to compare apples with oranges. In the example above, prices range between €90 and €110 and all products are male shoes, so the analysis is sound, as we are comparing apples with apples. However, many e-commerce companies have a much higher variability in prices. Consider a fashion retailer selling both socks below €10 and bags above €50, or a furniture retailer selling both pillows at less than €50 and bookshelves at more than €150.

2. The second major limitation is the lack of additional useful product “tags” that can usually be found in the catalog. Once e-commerce companies have more than one distinct category of products, or more than one brand, vendor or colour, the product catalog analysis can yield another set of insights: comparative views within segments of products, as well as insights into the performance of whole segments of products.
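The colour-coding rules and the exposure projection described at this level can be sketched as follows. This is a sketch under stated assumptions rather than the exact method behind Figure 2: the averages are taken as simple unweighted means, every product is assumed to have at least one view, the names classify and projected_orders are hypothetical, and the four-product catalog is made up for illustration.

```python
from statistics import mean


def classify(products, top_n=3):
    """Label each product "yellow", "red", "green" or None.

    products maps a product name to a (views, orders) tuple.
    """
    rates = {name: orders / views for name, (views, orders) in products.items()}
    avg_views = mean(views for views, _ in products.values())
    avg_rate = mean(rates.values())  # assumption: unweighted mean of the rates
    best_converters = set(sorted(rates, key=rates.get, reverse=True)[:top_n])

    labels = {}
    for name, (views, _) in products.items():
        if views > avg_views:
            # High exposure: good converters are the baseline, poor ones cost money.
            labels[name] = "yellow" if rates[name] > avg_rate else "red"
        elif name in best_converters:
            # Low exposure but among the best converters: organically trending.
            labels[name] = "green"
        else:
            labels[name] = None
    return labels


def projected_orders(views, rate):
    """Orders expected at a given exposure if the conversion rate holds."""
    return views * rate


# Hypothetical catalog: name -> (views, orders).
catalog = {"P1": (4000, 100), "P2": (2000, 100), "P3": (500, 40), "P4": (500, 10)}
print(classify(catalog))

# Product F at Product A's exposure, using the figures quoted in the text.
print(round(projected_orders(3867, 0.0761)))
```

Holding the conversion rate constant at much higher exposure is itself an assumption, which is why the rate should be monitored after any placement switch.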

Level 3: Product Catalog Analysis

Data Requirements: Hard
purchase_product | attributes: name_ID, quantity, price, cost, timestamp
product_detail_view | attributes: name_ID, # of views per product, timestamp
products_catalog | attributes: name_ID, margin (category, colour, brand, vendor)

To take the Catalog Analysis yet another level higher and overcome some of the limitations of Level 2, we must enrich the dataset with each product’s gross_profit value. From that we can calculate Profit Per View ("PPV") = gross_profit * % conversion, and Total Gross Profit = gross_profit * # of orders. PPV shows how much profit each additional view would bring to the e-shop. For example, in Figure 3, Product A with a PPV of €1.41 would have generated an extra €1,410 if it had been viewed an extra 1000 times. This is a theoretical, retrospective figure for the week; however, it provides a useful guide for an e-commerce director deciding which products to prioritise going forward.

Figure 3: Level 3, Top 20 products sold in a weekly period for a UK fashion brand, combined with their number of product detail views in that same week and gross_profit. Ordered by # of orders. PPV and Total Gross Profit are calculated.
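The two Level 3 formulas can be sketched in Python as below. The function names are hypothetical. The gross profit of €50 per order for Product A is an assumption, not a value given in the article: it is chosen only because, with 109 orders and 3867 views, it gives a PPV close to the €1.41 quoted above.

```python
def profit_per_view(gross_profit: float, orders: int, views: int) -> float:
    """PPV = gross_profit * conversion rate, with conversion = orders / views."""
    return gross_profit * orders / views


def total_gross_profit(gross_profit: float, orders: int) -> float:
    """Total Gross Profit = gross_profit * number of orders."""
    return gross_profit * orders


# ASSUMPTION: EUR 50 gross profit per order is illustrative, not from the source.
ASSUMED_GROSS_PROFIT_A = 50.0

ppv_a = profit_per_view(ASSUMED_GROSS_PROFIT_A, orders=109, views=3867)
extra_profit = round(ppv_a, 2) * 1000  # theoretical profit from 1000 extra views

print(f"PPV: {ppv_a:.2f}")
print(f"Extra profit from 1000 more views: {extra_profit:.0f}")
print(f"Total gross profit: {total_gross_profit(ASSUMED_GROSS_PROFIT_A, 109):.0f}")
```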

Now consider this enriched dataset, shown above in Figure 3, where the products retain the colour scheme of the findings from Level 2. The new variables, particularly PPV, recontextualise our findings from the Level 2 analysis. Consider our green “organically trending” products. At Level 2, we would have ranked Product F as the best performer, then Product H, then Product M. However, if we also consider the PPV of these products, we can see that Product M is the one that would have added the most profit per additional view_item. Suppose an e-commerce director were to put all the findings of the Product Catalog Analysis into practice and model what would happen if the three best green “organically trending” products replaced the "costing us money" products:

Product M, 683 view_items -> for Product A, 3867 view_items
Product H, 680 view_items -> for Product E, 3282 view_items
Product F, 415 view_items -> for Product J, 1985 view_items

Given the same week, and assuming the same conversion rates for the products, these modelled “switches” would yield an extra profit of €15,654 for the e-shop, a 42% increase on the total gross profit of the top 20 products in this example. Had we used only the Level 2 insight, we would have prioritised the winners differently and, over the same time period, achieved lower benefits.

Figure 4: Level 3, Model the Top 20 products for a UK fashion brand, if top 3 organically trending and most profitable products replaced the top 3 products costing us money.
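The modelled switches can be sketched as a simple calculation: if two products trade view counts and each keeps its own PPV, the profit change follows directly. This is a sketch under assumptions, not the model behind Figure 4. The view counts and Product A's PPV of €1.41 come from the article; every other PPV below is a hypothetical placeholder, so the total printed will not match the €15,654 reported above. The function name swap_gain is also hypothetical.

```python
def swap_gain(low_views, low_ppv, high_views, high_ppv):
    """Extra profit when two products trade view counts, PPV held constant.

    "low" is the under-exposed product, "high" the over-exposed one.
    """
    before = low_views * low_ppv + high_views * high_ppv
    after = high_views * low_ppv + low_views * high_ppv
    return after - before


# (green product, views, PPV, replaced product, views, PPV)
# All PPVs except Product A's 1.41 are HYPOTHETICAL placeholders.
swaps = [
    ("M", 683, 3.00, "A", 3867, 1.41),
    ("H", 680, 2.80, "E", 3282, 1.20),
    ("F", 415, 2.50, "J", 1985, 1.00),
]

total_gain = 0.0
for green, g_views, g_ppv, other, o_views, o_ppv in swaps:
    gain = swap_gain(g_views, g_ppv, o_views, o_ppv)
    total_gain += gain
    print(f"Product {green} <-> Product {other}: {gain:,.2f}")

print(f"Total modelled extra profit: {total_gain:,.2f}")
```

The gain is positive whenever the under-exposed product has the higher PPV, which is why ranking the green products by PPV rather than by conversion rate alone changes which switch to make first.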


Would you like to find out which products are organically trending in your catalog? Would you like to find out which products are perhaps over-exposed? If you would like to maximise the potential of your product catalog on your e-shop and marketing channels, please contact daniar@datacop.services or lukas@datacop.services.
