The below script shows how we may handle RFM segmentation with Python. RFM stands for Recency, Frequency and Monetary:

  • Recency: How many days since they last purchased
  • Frequency: Total purchase count by customer
  • Monetary: Total spend by customer

This method enables us to group our customers & target specific groups with advertisements or deals. The way we approach the task is:

  • Calculate the quartiles for recency, frequency and monetary
  • Segment those into quartiles, where quartile 2 and 3 are joined to become a 'moderately engaged' quartile
  • Create an RFM score from the quartiles.

So, if my recency is 1, frequency is 2 and monetary is 1, my score will be 121. Using the scores, bucket the customers into specific groups

In [ ]:
import pandas as pd
path = 'Desktop/customer_features.csv'
df = pd.read_csv(path, delimiter=',', header='infer')

#Calculate the percentile for recency, frequency and monetary

df['recencypct'] = df.recency.rank(pct=True)
df['freqencypct'] = df.frequency.rank(pct=True)
df['monencypct'] = df.monetary.rank(pct=True)

#bucket recency into quartiles
df.loc[df.recencypct <= 0.25, 'recencyqtl'] = 1
df.loc[df.recencypct > 0.75, 'recencyqtl'] = 3
df.loc[(df.recencypct > 0.25) & (df.recencypct <=0.75), 'recencyqtl'] = 2

#bucket frequency into quartiles
df.loc[df.freqencypct <= 0.25, 'frequencyqtl'] = 1
df.loc[df.freqencypct > 0.75, 'frequencyqtl'] = 3
df.loc[(df.freqencypct > 0.25) & (df.freqencypct <=0.75), 'frequencyqtl'] = 2

#bucket monetary into quartiles
df.loc[df.monencypct <= 0.25, 'monetaryqtl'] = 1
df.loc[df.monencypct > 0.75, 'monetaryqtl'] = 3
df.loc[(df.monencypct > 0.25) & (df.monencypct <=0.75), 'monetaryqtl'] = 2

#slimline dataframe and drop decimal points
df2 = df[['id', 'recencyqtl', 'monetaryqtl', 'frequencyqtl']].astype(int)

#calculate the recency, frequency, monetary score (RFM) and then count the number of customers in each group
df2['score'] = df2["recencyqtl"].map(str) + df2["frequencyqtl"].map(str) + df2["monetaryqtl"].map(str)
df3 = df2[['id', 'score']].groupby(['score']).agg(['count'])

#We have a possible 720 groups if we have 3 number scores. So, we need to reduce this
#if frequency and recency are low then we will say they're disengaged customers

df2.loc[(df2.recencyqtl == 1) & (df2.frequencyqtl == 1)& (df2.monetaryqtl == 1), 'Segment'] = 'Disengaged'
df2.loc[(df2.recencyqtl == 1) & (df2.frequencyqtl ==1)& (df2.monetaryqtl == 2), 'Segment'] = 'Disengaged Moderate Spenders'
df2.loc[(df2.recencyqtl == 1) & (df2.frequencyqtl ==2)& (df2.monetaryqtl == 1), 'Segment'] = 'Moderately Engaged Low Spenders'
df2.loc[(df2.recencyqtl == 1) & (df2.frequencyqtl ==2)& (df2.monetaryqtl == 2), 'Segment'] = 'Moderately Engaged Moderate Spenders'
df2.loc[(df2.recencyqtl == 2) & (df2.frequencyqtl ==1)& (df2.monetaryqtl == 1), 'Segment'] = 'Recent Low Spenders'
df2.loc[(df2.recencyqtl == 2) & (df2.frequencyqtl ==1)& (df2.monetaryqtl == 2), 'Segment'] = 'Moderately Engaged'
df2.loc[(df2.recencyqtl == 2) & (df2.frequencyqtl ==2)& (df2.monetaryqtl == 1), 'Segment'] = 'Moderately Engaged'
df2.loc[(df2.recencyqtl == 2) & (df2.frequencyqtl ==2)& (df2.monetaryqtl == 2), 'Segment'] = 'Moderately Engaged'
df2.loc[(df2.recencyqtl == 3) | (df2.frequencyqtl ==3) | (df2.monetaryqtl == 3), 'Segment'] = 'Engaged'
df2.loc[(df2.Segment.isnull()), 'Segment'] = 'Other Segment'

#Now, let's count how many customers are in each of our new segments.
df4 = df2[['id', 'Segment']].groupby(['Segment']).agg(['count'])