Segmentation is the art of understanding your customer base so that you can serve it better. It starts from the recognition that customers are diverse: to serve a large market effectively, you must understand how customers differ and which differences matter.
One thing our clients often want to understand about their customer-base is the customer journey: all the phases and key points of a customer’s lifecycle of interactions with the brand.
For example, some customers may start in-store and move exclusively online; others may only shop on mobile; still others may be truly omni-channel and buy online when they’re at work, in-store when they’re in the area, and on mobile when they’re on their commute. Some may become more loyal after redeeming a discount; others less so—and so on. But if you have thousands (or millions) of customers, how can you understand the behavior patterns across your customer base?
One answer is sequence analysis. At its core, this means analyzing a large set of sequences (one per customer) and drawing meaningful conclusions about which sequences are similar to, or different from, one another.
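To make "similar or different" concrete, here is a minimal sketch of one common way to compare two sequences: edit (Levenshtein) distance over channel labels. This is an illustrative choice, not necessarily the measure used in the exercise described here, and the three example sequences are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences of channel labels."""
    # At the start of iteration i, prev[j] is the distance between a[:i-1] and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # drop x from a
                            curr[j - 1] + 1,      # insert y into a
                            prev[j - 1] + cost))  # keep or substitute
        prev = curr
    return prev[-1]


# Hypothetical channel sequences for three customers
store_only = ["regular", "regular", "regular"]
store_then_online = ["regular", "online", "online"]
mobile_only = ["mobile", "mobile", "mobile"]

d_near = edit_distance(store_only, store_then_online)
d_far = edit_distance(store_only, mobile_only)
```

Here `d_near` is 2 and `d_far` is 3, so under this measure the in-store customer sits closer to the one who migrated online than to the mobile-only customer.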
Below, we walk through a proof-of-concept exercise we recently performed for a very large multichannel brand. The data is real transactional data (although we used an extract of 30k customers).
The data we looked at is formatted as follows. Each row has a customer ID, a sale channel (regular, online, or mobile), the day of the order (counted from the first order observed in the whole orders database), and the order number for that customer. This is real data for one customer in the dataset:
|ID||Transaction Type||Order Date||Transaction Number|
Using just these few variables, it is possible to use statistical techniques to determine clusters of similar customers.
We do this by considering, for each customer’s sequence of transactions, the interpurchase times (how many days passed between purchases), the purchase channel, and the total number of transactions.
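As a rough sketch of how those per-customer inputs could be derived from rows in the format above, the snippet below groups transactions by customer and computes the channel sequence, the interpurchase times, and the transaction count. The rows and the function names are hypothetical, and the clustering step itself is not shown because the method is not specified here.

```python
from collections import defaultdict

# Hypothetical rows: (customer_id, channel, order_day, order_number)
rows = [
    ("A", "regular", 10, 1),
    ("A", "online", 40, 2),
    ("A", "online", 45, 3),
    ("B", "mobile", 7, 1),
]


def build_sequences(rows):
    """Group rows by customer and sort each customer's orders by order number."""
    by_customer = defaultdict(list)
    for customer_id, channel, day, order_number in rows:
        by_customer[customer_id].append((order_number, day, channel))
    return {cid: sorted(orders) for cid, orders in by_customer.items()}


def sequence_features(orders):
    """Summarize one customer's ordered transactions."""
    days = [day for _, day, _ in orders]
    channels = [channel for _, _, channel in orders]
    # Days elapsed between each pair of consecutive purchases
    gaps = [later - earlier for earlier, later in zip(days, days[1:])]
    return {
        "channels": channels,
        "interpurchase_days": gaps,
        "n_orders": len(orders),
    }


features = {
    cid: sequence_features(orders)
    for cid, orders in build_sequences(rows).items()
}
```

A customer with a single order, like `B` here, simply has an empty list of interpurchase times.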
Here’s an example of summary statistics of order counts and the percentage of customers and orders that fall into each segment:
|Segment||Number of Customers||Number of Orders||Percentage of Customers||Percentage of Orders|
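A summary with these columns can be produced from two lookups: each customer's segment label and each customer's order count. The sketch below assumes both are already available as dictionaries; the values are made up for illustration and are not the results of this analysis.

```python
from collections import Counter

# Hypothetical inputs: a segment label and an order count per customer
segment_of = {"A": 1, "B": 1, "C": 5, "D": 5, "E": 5}
orders_of = {"A": 4, "B": 6, "C": 1, "D": 1, "E": 1}


def segment_summary(segment_of, orders_of):
    """Customer and order counts per segment, with percentages of the totals."""
    customers = Counter(segment_of.values())
    orders = Counter()
    for cid, segment in segment_of.items():
        orders[segment] += orders_of[cid]
    total_customers = sum(customers.values())
    total_orders = sum(orders.values())
    return {
        segment: {
            "customers": customers[segment],
            "orders": orders[segment],
            "pct_customers": 100 * customers[segment] / total_customers,
            "pct_orders": 100 * orders[segment] / total_orders,
        }
        for segment in sorted(customers)
    }


summary = segment_summary(segment_of, orders_of)
```

Comparing the two percentage columns is a quick way to spot segments that are small in headcount but account for a large share of orders.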
Now we can start to diagnose how our segments are different. We start with a few summary plots and tables to identify high-level differences. The first plot (Figure 1) simply shows the percentage of each segment’s transactions that belong to each channel. Figure 2 shows the distribution of the number of purchases by customers in each segment. For example, everyone in Segment 5 has made just one purchase, but many people in Segment 8 have made fourteen or more.
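The numbers behind two plots of this kind are simple tallies. The following sketch computes each segment's channel mix and its distribution of purchase counts, assuming a segment label and a channel sequence per customer; the data and function names are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical data: a segment label and a channel sequence per customer
segment_of = {"A": 9, "B": 9, "C": 5}
channels_of = {
    "A": ["mobile", "mobile", "regular"],
    "B": ["mobile", "online"],
    "C": ["regular"],
}


def channel_share(segment_of, channels_of):
    """Fraction of each segment's transactions made in each channel."""
    counts = defaultdict(Counter)
    for cid, channels in channels_of.items():
        counts[segment_of[cid]].update(channels)
    return {
        segment: {ch: n / sum(c.values()) for ch, n in c.items()}
        for segment, c in counts.items()
    }


def purchase_count_distribution(segment_of, channels_of):
    """For each segment, how many customers made 1, 2, 3, ... purchases."""
    dist = defaultdict(Counter)
    for cid, channels in channels_of.items():
        dist[segment_of[cid]][len(channels)] += 1
    return dist


shares = channel_share(segment_of, channels_of)
dist = purchase_count_distribution(segment_of, channels_of)
```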
From these summary outputs we can make the following observations:
- There are two segments of one-and-done buyers: Segment 5 customers make a single in-store purchase, and Segment 10 customers make a single online purchase. There are no mobile one-and-done buyers.
- Segments 1 and 2 have similar purchase frequencies but Segment 1 is many times more likely to make purchases online or on mobile relative to Segment 2
- Segments 2 and 3 have a similar composition of orders (by channel) but Segment 3 purchases much more frequently
- Segment 4 customers do most of their shopping online, but they also purchase through other channels frequently. On average, they are also heavier purchasers than most segments
- Segments 6, 7, and 8 are similar in their purchase channel composition, but going from 6 to 7 and then to 8, customers purchase more frequently and spread their purchases out over more channels (in addition to in-store)
- Segment 9 is the mobile segment. 100% of these customers have made at least one mobile transaction (as we’ll see below), and mobile accounts for almost 60% of their purchases
Unpacking each segment’s lifecycle
We also investigate how purchase behavior evolves over a customer’s lifecycle within each segment. The graphs below show the percentage of customers in each segment that have made a purchase in a given channel by a certain date. For example, if the line is at 50% on day 100 for “mobile”, it means that 50% of customers in that segment have made a mobile purchase by the 100th day after their first purchase. The differences in these plots can tell us how different segments’ relationships with the brand evolve over time. We can illustrate this by looking at detailed plots for Segments 1 and 4.
In Segment 1, everyone begins with an in-store purchase and most continue to only purchase in-store. Throughout their lifetimes, only about 10% of this segment has ever made an online transaction, and about 5% have ever made a mobile transaction.
In Segment 4, about 90% of customers begin with an online transaction and another 10% begin with an in-store transaction. 100% of these customers have made an online transaction, but only about 50% have ever made an in-store purchase. About 25% of them have ever made a mobile purchase (although they are slow to do so).
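The curves described in this section can be computed directly from the transaction sequences. The sketch below, with hypothetical data and function names, finds the number of days between each customer's first purchase and their first purchase in a given channel, then reports the share of customers who have used that channel by each day offset. In practice it would be run once per segment and channel.

```python
# Hypothetical per-customer sequences of (order_day, channel), sorted by day
sequences = {
    "A": [(10, "regular"), (40, "online"), (45, "mobile")],
    "B": [(3, "online"), (103, "regular")],
}


def first_use_offsets(sequences, channel):
    """Days from each customer's first purchase to their first one in `channel`.

    Customers who never used the channel are omitted.
    """
    offsets = {}
    for cid, orders in sequences.items():
        first_day = orders[0][0]
        for day, ch in orders:
            if ch == channel:
                offsets[cid] = day - first_day
                break
    return offsets


def adoption_curve(sequences, channel, days):
    """Share of customers who have bought in `channel` by each day offset."""
    offsets = list(first_use_offsets(sequences, channel).values())
    n = len(sequences)
    return [sum(1 for o in offsets if o <= d) / n for d in days]


curve = adoption_curve(sequences, "online", days=[0, 30, 100])
```

With these two customers, `curve` is `[0.5, 1.0, 1.0]`: one of them starts online on day 0, and the other makes a first online purchase 30 days after their first order.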
What else can we investigate?
This just scratches the surface of the kinds of analysis that sequences of transactions make possible. Specifically, we could:
- Include brands or categories of products along with purchase channel in the sequence data
- Include basket size
- Reverse the sequence and look at common patterns of behavior before a customer stops purchasing
- Focus in on particular demographic segments
- Label each customer with their most likely purchasing segment, so that this information can be cross-analyzed with other customer data such as demographics. An analyst could then answer questions such as whether customers in a particular region or income bracket are more likely to move to mobile shopping
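As one illustration of the ideas in this list, the reversed-sequence idea could start from something like the sketch below. It counts the most common final channel patterns among customers treated as lapsed. The lapse rule (no order within `inactive_days` of the last observed day), the data, and the function name are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical (order_day, channel) sequences, sorted by day
sequences = {
    "A": [(1, "regular"), (20, "online"), (30, "online")],
    "B": [(5, "mobile"), (15, "online"), (25, "online")],
    "C": [(2, "regular"), (390, "regular")],
}


def final_patterns(sequences, last_observed_day, inactive_days, k):
    """Count the last-k channel patterns of customers treated as lapsed.

    Assumption: a customer is lapsed if their most recent order is more than
    `inactive_days` before `last_observed_day`.
    """
    patterns = Counter()
    for orders in sequences.values():
        if last_observed_day - orders[-1][0] > inactive_days:
            # Reverse the sequence so position 0 is the final purchase
            reversed_channels = [ch for _, ch in reversed(orders)]
            patterns[tuple(reversed_channels[:k])] += 1
    return patterns


patterns = final_patterns(
    sequences, last_observed_day=400, inactive_days=180, k=2
)
```

Customers `A` and `B` both count as lapsed and both end with two online orders, while `C` is still active and is excluded.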