Converting Free to Paid Subscribers for an Online Marketplace using Unsupervised Learning

Client’s Challenge

➢ Most online businesses and marketplaces face the problem of a very high number of free subscribers and only a small fraction of paid subscribers.
➢ The company had limited success in converting free suppliers to paid suppliers. Even after running a campaign to reach out to 100K free suppliers, only 2-3% converted to paid.
➢ It was a challenge to maintain a healthy mix of suppliers in high-demand categories without them churning frequently.
➢ Given the highly complex product and location hierarchy, the company could not find a solution with its existing Business Intelligence (BI) tools.
➢ The company was looking for a retail and eCommerce solution offering a comprehensive analysis of its data to find the most promising product-location segments to address.

Analysis & Solution Approach

➢ The first task was to understand the customer's problem in detail and break it down into subproblems. The problem was divided into two phases:
➢ Phase I:
➢ Identify clusters of product-location tuples that are attractive (high demand, high supply, low churn) and non-attractive (low demand, low supply, high churn).
➢ These segments would allow the customer to redirect its focus to suppliers in attractive segments and away from non-attractive segments.
➢ Phase II:
➢ Predict the chance of a free subscriber converting to paid, based on the segment they belong to.

Step 1: Data inputs related to demand, supply and churn were captured from various internal systems. These were then aggregated at the Product-Location level to create the dataset used for the clustering analysis.

Step 2: Given the diverse nature of products, locations and associated business seen on the portal, it was important to make data comparable across different product-locations. To achieve this, RFM (Recency-Frequency-Monetary Value) measures were generated from the base data.
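
As an illustration of Step 2, the sketch below derives Recency, Frequency and Monetary Value per product-location tuple. The record layout (product, location, event date, value), the sample rows and the function name are all assumptions made for this example; the case study does not describe the actual schema or the exact RFM definitions used.

```python
from collections import defaultdict
from datetime import date

# Hypothetical event records: (product, location, event_date, value).
# Products, locations and amounts are made up for illustration.
events = [
    ("pumps", "Pune", date(2023, 1, 10), 500.0),
    ("pumps", "Pune", date(2023, 3, 1), 300.0),
    ("valves", "Delhi", date(2022, 12, 5), 1200.0),
    ("pumps", "Delhi", date(2023, 2, 20), 150.0),
    ("valves", "Delhi", date(2023, 3, 10), 800.0),
]


def rfm_by_product_location(events, as_of):
    """Return {(product, location): (recency_days, frequency, monetary)}."""
    last_seen = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for product, location, event_date, value in events:
        key = (product, location)
        # Recency is measured from the most recent event for the tuple.
        if key not in last_seen or event_date > last_seen[key]:
            last_seen[key] = event_date
        frequency[key] += 1
        monetary[key] += value
    return {
        key: ((as_of - last_seen[key]).days, frequency[key], monetary[key])
        for key in last_seen
    }


rfm = rfm_by_product_location(events, as_of=date(2023, 3, 31))
for key, (recency, freq, value) in sorted(rfm.items()):
    print(key, recency, freq, value)
```

Because the three measures are on very different scales, they would typically be standardized or binned into scores before being passed to a distance-based model.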

Step 3: Once the Product-Location level RFM data was generated, it was ready to be consumed by machine learning models.

A K-Means clustering model was used to find clusters of Product-Location tuples. To find the ideal number of clusters, the within-cluster sum of squares (WCSS) was plotted against the number of clusters to produce an elbow graph. The elbow graph suggested that the data can ideally be divided into 7 clusters.
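
The elbow method can be sketched as follows. This is a minimal, dependency-free illustration and not the project's implementation: it runs plain Lloyd iterations with a deterministic farthest-first initialization (an assumption chosen to keep the example reproducible), on made-up RFM rows that form three obvious groups. In practice a library implementation of K-Means would normally be used, and the real data pointed to 7 clusters rather than 3.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def standardize(rows):
    """Z-score each column so no single RFM measure dominates the distance."""
    columns = list(zip(*rows))
    means = [sum(col) / len(col) for col in columns]
    stds = [
        (sum((x - m) ** 2 for x in col) / len(col)) ** 0.5 or 1.0
        for col, m in zip(columns, means)
    ]
    return [tuple((x - m) / s for x, m, s in zip(row, means, stds)) for row in rows]


def farthest_first(points, k):
    """Deterministic init: start at the first point, then repeatedly add
    the point farthest from the centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        )
    return centroids


def kmeans_wcss(points, k, max_iter=100):
    """Run Lloyd's algorithm and return the within-cluster sum of squares."""
    centroids = farthest_first(points, k)
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its members
        # (an empty cluster keeps its previous centroid).
        updated = [
            tuple(sum(dim) / len(members) for dim in zip(*members)) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return sum(min(squared_distance(p, c) for c in centroids) for p in points)


# Made-up (recency, frequency, monetary) rows: three tight groups.
centers = [(10, 2, 100), (90, 20, 150), (40, 50, 900)]
offsets = [(-1, -1, -5), (1, 0, 5), (0, 1, 0), (0, 0, -5), (1, -1, 5)]
rfm_rows = [tuple(c + o for c, o in zip(center, off)) for center in centers for off in offsets]

points = standardize(rfm_rows)
wcss = {k: kmeans_wcss(points, k) for k in range(1, 7)}
for k, value in wcss.items():
    print(k, round(value, 3))
```

Plotting `wcss` against `k` gives the elbow graph; the "elbow" is the value of `k` after which adding more clusters yields only a marginal drop in WCSS.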

Further analysis of the 7 clusters revealed that 2 of them were attractive segments, another 2 were non-attractive segments and the remaining 3 were mid-segments.

Step 4: Various supervised learning models were applied to feature-rich supplier data to predict the chance of converting each supplier from free to paid. The models were validated on test and blinded data to assess model accuracy.
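
To make the Phase II idea concrete, the sketch below fits a deliberately simple baseline: a smoothed conversion rate per segment, scored for accuracy on held-out data. It is a stand-in, not one of the models used in the project, which drew on many more supplier features. The segment names, the sample rows, the 0.5 threshold and the function names are all assumptions for illustration.

```python
from collections import Counter

# Hypothetical labelled suppliers: (segment, converted_to_paid).
train = (
    [("attractive", 1)] * 6 + [("attractive", 0)] * 2
    + [("mid", 1)] * 2 + [("mid", 0)] * 4
    + [("non_attractive", 1)] * 1 + [("non_attractive", 0)] * 7
)
test = (
    [("attractive", 1)] * 3 + [("attractive", 0)] * 1
    + [("mid", 1)] * 1 + [("mid", 0)] * 2
    + [("non_attractive", 0)] * 3
)


def fit_segment_rates(rows, alpha=1.0):
    """Laplace-smoothed conversion rate per segment."""
    totals, conversions = Counter(), Counter()
    for segment, converted in rows:
        totals[segment] += 1
        conversions[segment] += int(converted)
    return {s: (conversions[s] + alpha) / (totals[s] + 2 * alpha) for s in totals}


def predict_proba(rates, segment, default=0.5):
    """Estimated chance of conversion; unseen segments fall back to `default`."""
    return rates.get(segment, default)


def accuracy(rates, rows, threshold=0.5):
    """Share of rows where the thresholded prediction matches the label."""
    hits = sum((predict_proba(rates, s) >= threshold) == bool(y) for s, y in rows)
    return hits / len(rows)


rates = fit_segment_rates(train)
test_accuracy = accuracy(rates, test)
print(rates)
print(test_accuracy)
```

A baseline like this is mainly useful as a reference point: a richer supervised model is only worth deploying if it beats the per-segment rate on the same held-out data.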

Benefits Delivered:

✓ The AI-driven solution achieved 73% prediction accuracy.
✓ The biggest positive was that the customer stopped its selling activities in the non-attractive segments, which saved the cost of wasteful sales activity.
✓ Another major impact was a significant increase in sales activity in the attractive segments, leading to a significant increase in online sales.
✓ The third impact is the customer's use of the predictive models to ascertain the chance of a free supplier becoming paid even before the supplier is approached.

