When you run a software-as-a-service product, your subscription business model and retention are your key assets. While growth relies on bringing in new signups, churn can be a black hole where all of your triumphs and successes evaporate.
HubSpot defines customer churn as the percentage of customers that stop using your services or paying for your subscription during a certain timeframe.
If left unchecked, software churn can cancel out all of your marketing and sales efforts on a monthly basis and lead to bad reviews. Instead of letting churn hurt your company, check out some of the ways to use data science basics to create a predictive churn model and how you can use marketing to prevent churn.
Many of the practice datasets out there have a simple “Churn” tag marked yes or no. Your own data won’t come labeled like this, so we need to properly define churn for your software and create a feature called “Churn Date”: the day the customer churned.
The reason we are using “Churn Date” rather than “Churn” is that we want to be able to use multiple time periods to identify and predict churn. If churn is not tied to a time period, we won’t be able to use fresh data like the first month to predict if a client will churn.
First, let’s discuss the two types of churn: contractual and non-contractual.
Contractual churn covers any type of account where the customer has to explicitly cancel to stop paying, for example by cancelling their account inside your application and confirming the cancellation. This is the easiest type to predict because there is a specific moment when someone stopped.
Contractual churn is also simple to track: just make sure you record the date a user cancels their account in your database so you can use that data accurately in the future.
Non-contractual churn is much more difficult to track and predict as it is based on inactivity. We will go over how to identify a reasonable timeframe before declaring that a customer has churned.
Non-contractual companies have users who pay as they go or simply complete a specific action; in churn prediction, that action is called a “Critical Event”. This matters because we need to decide how long a customer can go without a critical event before we set a churn date.
A standard best practice is to take all of the critical actions and calculate the number of days since the previous critical action.
An approach recommended by HackerNoon’s article on defining churn is to take the average gap for every single customer and then plot the average days on the x-axis against the percentage of customers on the y-axis. From there, you can find an inflection point where adding more days has little additional impact.
In this example from Hackernoon, the churn date would be after 7 days of not completing a critical event.
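If you want to see how that gap analysis could look in code, here is a minimal sketch. It assumes a hypothetical events table with a customer_id and an event_date column for every critical event, so adjust the names to your own schema.

import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical file: one row per critical event, per customer
events = pd.read_csv('critical_events.csv', parse_dates=['event_date'])
events = events.sort_values(['customer_id', 'event_date'])

# Days since each customer's previous critical event
events['days_since_last'] = events.groupby('customer_id')['event_date'].diff().dt.days

# Average gap per customer, then the cumulative share of customers by gap size
avg_gap = events.groupby('customer_id')['days_since_last'].mean().dropna()
cumulative_pct = avg_gap.value_counts(normalize=True).sort_index().cumsum() * 100

cumulative_pct.plot()
plt.xlabel('Average days between critical events')
plt.ylabel('Percentage of customers')
plt.show()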
I would recommend using the last critical event as the official churn date, even though it is marked in hindsight, since the experience around that event may have caused the decision, conscious or unconscious, to stop using your product.
Before predicting whether a customer will churn, we need to identify if it is even worth making a model off of specific demographics or feature usage.
Visualizing your data is the most important step. It’s easy to skip, but if we do, we risk spending days building a complex model only to realize it has no impact on churn.
We will use box plots and bar graphs to identify if a specific feature has an impact on churn.
SaaS data is not readily available for download so we will use a couple of datasets from Kaggle that are similar. Please use these graphs as examples and use your own SaaS data.
For this project, we will need Pandas, a data analysis library, Seaborn, a statistical visualization library, and Matplotlib, the most basic plotting library in Python.
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
The first step for you will be to download a sample dataset, or if you already work for a company where you have access to churn data, access that.
In this case, we will use a widely used learning dataset called “Telco Customer Churn”. You can download it on Kaggle here.
Copy the CSV you download into your PyCharm project directory. You will then load it using the Pandas “read_csv” method, which turns it into a DataFrame stored in the “telco” variable.
telco = pd.read_csv('telco.csv')
Since this Telco data is based on contractual churn, we can use it as a solid example of how to identify when most of your customers are churning. We can accomplish this with a couple of key visualizations: either a histogram or a box plot.
We can see in this graph that a majority of churn occurs before the 30 day mark. This means that the first month sets a major precedent for whether or not a customer will stay with us. If we tried a cumulative frequency graph like HackerNoon’s above, we would see the same thing.
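For reference, the histogram described above can be produced with a couple of lines; the 30-bin setting is just an assumption for readability.

# Histogram of tenure for customers who churned
plt.hist(telco[telco['Churn'] == 'Yes']['tenure'], bins=30)
plt.xlabel('Tenure')
plt.ylabel('Number of churned customers')
plt.show()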
Alternatively, we can identify the majority of churn by creating a box plot using the Seaborn library and running the following code.
sns.boxplot(x=telco['Churn'], y=telco['tenure'], sym="")
plt.show()
We can tell from this boxplot that the top of the box, the 75th percentile, sits around 30, which means that 75% of the users who churn do so before the one-month mark. When we talk about how to prevent this, one option would be to allocate resources to onboarding users in that first month and to favor more personal outreach, such as calls instead of emails.
We want to first identify if there is a larger trend of churn within certain demographics. Some of this can be done at a glance, but we may also need to look at percentages of total users, since we could be deceived if certain firmographics make up the majority of our user base.
In order to identify which of these features can be used, we should start by looking at the columns and data types the dataset gives us. This can be accomplished by running the following line of code:
print(telco.dtypes)
We want to look at all of these features to see if they have an impact on churn, except for customerID, MonthlyCharges, and TotalCharges. Instead of writing a boxplot for each of them, I wrote a quick for loop that only plots features with fewer than 10 options, using the value_counts function.
columns = telco.columns
for column in columns:
    if len(telco[column].value_counts()) < 10:
        sns.boxplot(x=telco['Churn'], y=telco['tenure'], sym="", hue=telco[column])
        plt.show()
Let’s evaluate a couple of these box-plots and see if there’s a significant difference in who is churning and if they’re churning early.
With this boxplot, we’re asking the question, “Do customers with dependents stay longer than those without?” In this situation, the answer is yes. This makes sense, since the dependents themselves aren’t the ones making the decision, but having them clearly matters to the account holder who does.
Sadly, these boxplots don’t show the amount of churn attributed to these groups, so we will need to use bar graphs to break down whether a large majority of churn is driven by these features.
Another visualization that may be useful for software companies is to identify if certain amounts of feature usage show a likelihood of them churning. The best illustration of this is from Bowen Chen’s article on Towards Data Science where he visualizes the feature usage of music streaming services with boxplots.
What’s most interesting about these boxplots is that you can very clearly see that if a customer adds more friends on their account, they are more likely to stay with the platform. The same goes for the number of times they add a song to a playlist. While these won’t give you hard numbers on when a client can be considered safe from churn, they are good indicators of which actions and features give someone a meaningful experience with your product.
If you are interested in learning more, visit his article at Towards Data Science.
When we are talking about predicting churn in a simple way, there are two routes we can take: a logistic regression or a decision tree.
We will need to import three more libraries at the top of our script:
from sklearn.metrics import recall_score, precision_score, accuracy_score, f1_score
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
Recall, precision, accuracy, and F1 scores will help us measure how effective our model is. StandardScaler will allow us to normalize our variables easily. Train test split will let us split our data up so we can create a training and a test set of data.
A decision tree is an alternative model that can be used to identify patterns in churn and show exactly which variables lead to it. That being said, its predictions come back as a plain yes or no with no probability attached, but you can visualize the tree to show which variables have the biggest impact.
In this section, we will walk through the basics of creating both of these models and more importantly, how to test for a successful model.
The hardest part of creating this model will be preprocessing our data so that we can actually build a predictive model.
This involves removing identifying info, encoding booleans, normalizing numerical data, encoding categorical data, removing highly correlated variables, and creating a train/test set. From there, we can get started on building a model.
Most identifying data is encoded as IDs and integers, so we want to make sure they aren’t used as part of the model, which would completely confuse the decision tree or logistic regression. We do this by storing the IDs, since we will need them later, and then dropping the column.
ids = telco['customerID']
telco = telco.drop(['customerID'], axis=1)
In order to predict churn, you will need to convert any strings that represent booleans into ones and zeroes. Instead of doing this for each feature, I wrote a quick script that goes through each column and substitutes the strings where they exist. This should work for any dataset, since it wraps the replacement in a try/except statement.
columns = telco.columns
for column in columns:
    try:
        telco[column] = telco[column].replace({'No': 0, 'Yes': 1, 'no': 0, 'yes': 1})
    except TypeError:
        print('TypeError')
print(telco.head())
We will need to ensure that any numerical data is standardized so that large values don’t overwhelm the model. We will use the StandardScaler class we imported earlier.
numerical_data = ['tenure', 'MonthlyCharges']

# Inspect the numerical columns before scaling
print(telco['tenure'])
print(telco['MonthlyCharges'])
print(telco['TotalCharges'])

# Make sure tenure and MonthlyCharges are numeric rather than strings
telco['tenure'] = telco['tenure'].apply(lambda x: int(x) if type(x) == str else x)
telco['MonthlyCharges'] = telco['MonthlyCharges'].apply(lambda x: float(x) if type(x) == str and x != '' else x)

# Drop TotalCharges, which we won't use in the model
telco = telco.drop('TotalCharges', axis=1)

# Standardize the numerical columns, then merge them back in by index
scaler = StandardScaler()
scaled_df = pd.DataFrame(scaler.fit_transform(telco[numerical_data]))
scaled_df.columns = numerical_data
print(scaled_df)
telco = telco.drop(numerical_data, axis=1)
telco = telco.merge(scaled_df, left_index=True, right_index=True)
Your standardized data should look like this when we’re finished.
We want to make sure that we aren’t using any data in our predictive model that is reiterated by a similar datapoint. This is the case for most calculated columns, and for cost columns when payment depends on usage. A simple way to spot these is to create a correlation matrix.
pd.options.display.width = 0
print(telco.corr())
sns.heatmap(telco.corr(), cmap='coolwarm')
plt.show()
Upon examining this heatmap, it looks as if none of these variables are heavily correlated with each other. That means we can avoid deleting any specific variables. If there were any variables close to .9 correlation, we would have had to drop them from the model to ensure we do not duplicate data.
We will also need to encode categorical data which is slightly more difficult. We don’t want to assign an arbitrary order so we can’t simply turn each value into its own number. Instead, we will turn each possible option of the value into its own boolean.
If we wanted to encode their state, we would create 50 columns, one for each state. 49 states would be false and 1 state would be marked as true. This ensures that the model can pick up on the state without us imposing an arbitrary numbering on which state is which.
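As a quick illustration of that idea, here is a toy example with a hypothetical “state” column; Pandas’ get_dummies does exactly this expansion for us.

# Toy example: one boolean column per state value
sample = pd.DataFrame({'state': ['CA', 'NY', 'CA', 'TX']})
print(pd.get_dummies(sample['state']))
# Each row is 1 for its own state column and 0 for every other state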
In the Telco case, the data points we would have to encode are gender, Contract length, Payment Method, Multiple Lines, Internet Service, and a couple of others. We will use the “get_dummies” method, which turns each option in these columns into its own boolean column.
categorical_data = ['gender', 'PaymentMethod', 'Contract', 'MultipleLines',
                    'InternetService', 'OnlineSecurity', 'OnlineBackup',
                    'DeviceProtection', 'TechSupport', 'StreamingTV', 'StreamingMovies']
for column in categorical_data:
    try:
        # Prefix the values with the column name so the dummy columns stay identifiable
        telco[column] = telco[column].replace({'Yes': column + '_' + 'Yes',
                                               'No': column + '_' + 'No',
                                               'No internet service': column + '_' + 'No internet service'})
        print(telco[column])
        column_dummy = pd.get_dummies(telco[column])
        telco = telco.merge(column_dummy, left_index=True, right_index=True)
        telco = telco.drop(column, axis=1)
    except KeyError:
        print("KeyError")
This step will be the most important step in creating our model. We will need to split our data set up into two separate sections, one for training the model and one for testing the model.
Typically, your training data will be 80% of your data and your test data will be 20% of your data.
We will also need to separate the Churn flag and the variables into two separate sets.
churn = telco['Churn']
telco = telco.drop('Churn', axis=1)
X_train, X_test, y_train, y_test = train_test_split(telco, churn, test_size=.2)
Now that we have all the data we need, we can create a decision tree and visualize it to see if there are any obvious drivers of churn. A decision tree is simply a model based on a series of if statements that allow you to identify certain features that lead to churn.
The end result will be a tree diagram like the one pictured. After running our training and test data through it, we are able to see that the majority of churn occurs when tenure is under 30 days and customers either don’t have Tech Support or have multiple lines. You can identify additional situations by looking deeper into the leftmost Churn leaves. Notice that the decision tree is a relatively easy way to see what drives churn.
# These imports are needed in addition to the ones at the top of the script
from sklearn.tree import DecisionTreeClassifier
from sklearn import tree
import graphviz


def create_decision_tree(X_train, X_test, y_train, y_test):
    # Fit a shallow tree so the visualization stays readable
    mytree = DecisionTreeClassifier(max_depth=5)
    treemodel = mytree.fit(X_train, y_train)

    # Predict churn on both the training and test sets
    tree_pred_y_train = treemodel.predict(X_train)
    tree_pred_y_test = treemodel.predict(X_test)

    # Export and display the tree with graphviz
    exported = tree.export_graphviz(decision_tree=mytree, out_file=None,
                                    feature_names=X_train.columns, precision=1,
                                    class_names=['Not churn', 'Churn'], filled=True)
    graph = graphviz.Source(exported)
    graph.view()

    # Score the model (these metrics are explained in the next section)
    tree_train_precision = round(precision_score(y_train, tree_pred_y_train), 4)
    tree_train_recall = round(recall_score(y_train, tree_pred_y_train), 4)
    tree_test_precision = round(precision_score(y_test, tree_pred_y_test), 4)
    tree_test_recall = round(recall_score(y_test, tree_pred_y_test), 4)
    test_f1_score = round(f1_score(y_test, tree_pred_y_test), 4)

    return [[tree_train_precision, tree_train_recall],
            [tree_test_precision, tree_test_recall],
            [test_f1_score]]
When we’re testing our prediction model, we want to make sure that it is accurate before we actually use it in production. If we do not test it, we will end up having a massive problem by predicting that users that love us will churn and users that hate us will stay.
When you test your data, there are three percentages that will show whether or not our churn model is accurate: accuracy, precision, and recall.
In most cases, your churn data will be imbalanced. Most SaaS companies have a churn rate between 5% and 10%, which means you will have far fewer churned users than non-churned users. If you have only 100 cases of churn among 1,000 users, a model could predict that no user will churn and still be 90% accurate.
Obviously, a model like that would be worthless, so we turn to our other two metrics: precision and recall.
Before we discuss what precision and recall are, we need to talk about the Confusion Matrix, a visualization of how often your model predicts the right thing. There are four outcomes in a predictive model: true positives, true negatives, false positives, and false negatives.
The code to create this more formatted confusion matrix can be found at Dennis T’s Medium article. It will also be included in the full download of this script.
For the simple way to display this graph, simply run the following code.
import numpy as np
from sklearn.metrics import confusion_matrix

# Build the confusion matrix from the test set predictions, then plot it as shares of all predictions
cf_matrix = confusion_matrix(y_test, tree_pred_y_test)
sns.heatmap(cf_matrix / np.sum(cf_matrix), annot=True, fmt='', cmap='coolwarm')
plt.title('Confusion Matrix')
plt.show()
Precision is the number of True Positives divided by the Sum of all True Positives and False Positives. As there are more false positives, the percentage decreases. Precision is less important than recall in Churn, but still important. Having a low precision may cause you to take action on users that really don’t need to be saved.
Recall is the number of True Positives divided by the Sum of all True Positives and False Negatives. As there are more false negatives, the percentage of recall decreases. Recall is the most important value in churn prediction, as it reflects the share of actual churners that your model correctly catches.
An F1 score combines precision and recall into a single number so you can compare models.
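To make those definitions concrete, here is a tiny hand computation using made-up counts (the numbers are purely illustrative); below it is the scikit-learn version for our decision tree.

# Hand computation of the three metrics, with assumed example counts
true_positives = 80    # churners correctly flagged
false_positives = 40   # loyal users incorrectly flagged
false_negatives = 20   # churners the model missed

precision = true_positives / (true_positives + false_positives)   # 0.67
recall = true_positives / (true_positives + false_negatives)      # 0.80
f1 = 2 * precision * recall / (precision + recall)                # ~0.73
print(precision, recall, f1)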
tree_train_precision = round(precision_score(y_train, tree_pred_y_train), 4)
tree_test_precision = round(precision_score(y_test, tree_pred_y_test), 4)
tree_train_recall = round(recall_score(y_train, tree_pred_y_train), 4)
tree_test_recall = round(recall_score(y_test, tree_pred_y_test), 4)
test_f1_score = round(f1_score(y_test, tree_pred_y_test), 4)
print("Accuracy: " + str(round(accuracy_score(y_test, tree_pred_y_test), 4)))
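For completeness, here is a minimal sketch of the logistic regression route mentioned earlier, using the same train/test split. The max_iter setting is an assumption to help the solver converge, not something this walkthrough prescribes.

from sklearn.linear_model import LogisticRegression

# Fit a logistic regression on the same preprocessed training data
logreg = LogisticRegression(max_iter=1000)
logreg.fit(X_train, y_train)

# Logistic regression also gives a churn probability per customer,
# which is handy later when we push a "Churn Risk" value into the CRM
churn_probability = logreg.predict_proba(X_test)[:, 1]
logreg_pred_y_test = logreg.predict(X_test)

print("Precision: " + str(round(precision_score(y_test, logreg_pred_y_test), 4)))
print("Recall: " + str(round(recall_score(y_test, logreg_pred_y_test), 4)))
print("F1: " + str(round(f1_score(y_test, logreg_pred_y_test), 4)))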
Information is only valuable when it is backed by action. The same goes for data science. We can create models and theories on whether a user is going to churn, but if we don’t track the data and use it, it’s meaningless.
The first step will be adding your data into a CRM on the customer followed by using it in your advertising, web, in-app prompts, and email marketing strategies.
Your company’s customer relationship management system can be the driver of your company if set up correctly. In fact, there are many advantages of pushing data from your software into your CRM.
In this case, we’ll talk through how we can push the prediction on churn into your CRM. We will be doing a brief HubSpot example as it is my strongest CRM; however, this can be done for any CRM with an API.
We will first have to find our API Key in HubSpot.
Next, we’ll need to create a property in HubSpot called “Churn Risk”, either as a yes/no flag or as a probability, depending on the model you use in production.
Finally, simply update the deal to mark it as a churn risk. From there, you can set up triggers that notify your customer success reps.
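As a rough sketch of that last step, this is roughly what the update could look like with the HubSpot CRM v3 deals API and the requests library. The deal ID, the internal property name churn_risk, and the use of a private app access token are assumptions you would adapt to your own portal.

import requests

HUBSPOT_TOKEN = "your-private-app-token"   # assumption: a private app access token
DEAL_ID = "1234567890"                     # hypothetical deal ID from your CRM

# Update the deal with the model's output, assuming a custom "churn_risk" property exists
response = requests.patch(
    f"https://api.hubapi.com/crm/v3/objects/deals/{DEAL_ID}",
    headers={"Authorization": f"Bearer {HUBSPOT_TOKEN}",
             "Content-Type": "application/json"},
    json={"properties": {"churn_risk": "0.82"}},  # or "Yes"/"No" for a flag-style property
)
print(response.status_code, response.json())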
One of the easiest ways to solve churn is by identifying what is driving it. Often, software companies struggle to know exactly which features are causing users to churn. By breaking churn down by specific features, payment plans, or activities, they may be able to identify that a certain experience within their platform is driving most of it.
From there, you can alter your roadmap to fix the user’s experience and identify the impact it has on your user retention.
If you have identified that the majority of customers churn in the first 30 days, then it may make sense to make more calls in that first month and then transition them to an email cadence from then on.
This is a common strategy in software, as many startups provide onboarding as part of their service. It can prevent a large amount of churn while also avoiding the cost of replacing those customers.
A critical event isn’t the only thing that leads to retention. Sometimes, small actions can have an impact on engagement. These are called micro-conversions. This would be the equivalent of someone listening to a Daily Mix on Spotify when the critical event may be making their own playlist.
By identifying these micro-conversions, you can encourage a user to take a small action when they are not interested in taking a big one. While it’s not the same as making a purchase, it may make a difference in their engagement.
A constant question asked by sales and marketing teams is whether or not a certain persona is the right fit for a business. By investigating churn, you may notice that a large amount of your churn comes from a specific vertical. After identifying this discrepancy, you may be able to ask questions on whether your company is making a profit on these customers and if these customers are not being served well by your company.
By identifying this early, not only can you reduce churn, but you could also either develop features that help these customers perform better or switch focus completely to your best customers.
Another strategy to identify customer fit is to create clusters of your best customers and see what features define a low, medium, and high-value user. You can read more about that in this article.
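If you want to experiment with that idea, here is a minimal clustering sketch. The features (monthly spend and weekly logins) and the choice of three clusters are assumptions purely for illustration.

from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler
import pandas as pd

# Hypothetical per-customer summary features
customers = pd.DataFrame({
    'monthly_spend': [20, 25, 200, 220, 75, 80],
    'weekly_logins': [1, 2, 15, 12, 6, 5],
})

# Scale the features, then group customers into low/medium/high-value clusters
scaled = StandardScaler().fit_transform(customers)
customers['cluster'] = KMeans(n_clusters=3, n_init=10).fit_predict(scaled)
print(customers.groupby('cluster').mean())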
When your company runs on contractual churn, where a user has to click the cancel button, a cancellation funnel can be incredibly useful: it collects data and, in the best case, reduces churn.
When a customer requests to cancel their account, there may be a couple of ways to solve the problem before you officially cancel it. I recommend the following order:
When a user begins to cancel, ask for a wide array of reasons why they are unhappy with your service. While this may not always lead to you saving the deal, you will capture more data on why customers are churning.
From that data, you can identify what issues are costing you the most users. You can even take it a step further using clustering to identify which issues are causing high-value users to churn versus low-value users.
Here are some example reasons for churn that would be helpful to know:
Let’s say they are not using the software enough. You can redirect them to a confirmation page with a short video from the CEO or a customer success manager proposing ways to get more out of the product. I wouldn’t recommend upselling them, but providing resources to use the software better may be all they needed to stay. Based on how many people view the page and decide not to cancel, you can tell how powerful your message is or whether it falls flat.
Before your users confirm their cancellation, offer them an option to downgrade or simply pause their subscription. This is a common tactic that companies use to reduce churn as a user may be more likely to come back if they still have an account.
If a customer does cancel, a heartfelt message from the customer success rep or the CEO can go a long way when saying that you are sad to see them go and are passionate about fixing the issues that they reported. Remember, they may come back and if they do, they’ll be much easier to reacquire than to acquire a new customer. Don’t burn the bridge just because they cancelled your service.
Once a user signs up, they don’t need to see the same homepage as a brand new prospect. The role of your website may need to change as your user’s relationship with your company matures.
You may want to engage your users with a new offering, promote a blog post, or ask them for referrals. While this may not immediately lead to reduced churn, it can increase your user’s engagement so that they are less likely to churn.
I’d recommend testing out Google Optimize, a free homepage personalization tool. Ask your developers to add a first-party cookie to the application so that Google Optimize can identify users and change what your homepage looks like for them.
When users are likely to churn, they may start researching your competitors. I typically do not enjoy when companies use Adwords to rank on their competitor’s keywords, but I think you can do it tastefully when it is your own users.
What you can do is set up a Google Tracking Pixel on your application and create a Google Remarketing Audience of your users. From there, simply set up broad-match ads on each of your competitors’ keywords and restrict the targeting to your current users.
Have fun with the ad copy while also being respectful that your users may be churning. A fun ad like “Baby come back” may make your users laugh and click your ad instead.
On your landing page, you can talk about what you have to offer that is better than that competitor or you can even take it further by providing a win-back offer.
If you have set up your HubSpot tracking code correctly, you can even send that client an email, or place a call, to try to recover their business and talk through their issues with your service.
If your company has a non-contractual churn model, you may want to send a reminder email to a user when they haven’t performed a critical event in your churn window. In the earlier example, a food delivery company may want to send an email or push notification with a free delivery discount to a customer who hasn’t ordered food in 7 days.
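Here is a small sketch of how you might pull that list each day, again assuming a hypothetical events table with customer_id and event_date columns.

import pandas as pd

CHURN_WINDOW_DAYS = 7  # the churn window identified earlier

# Hypothetical file: one row per critical event, per customer
events = pd.read_csv('critical_events.csv', parse_dates=['event_date'])
last_event = events.groupby('customer_id')['event_date'].max()

days_inactive = (pd.Timestamp.today() - last_event).dt.days
at_risk = days_inactive[days_inactive >= CHURN_WINDOW_DAYS]

# These customer IDs would be fed to your email or push notification tool
print(at_risk.sort_values(ascending=False))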
An alternate approach is to reach out to a user who is likely to churn when they log in to your app or website and don’t take a critical action. They may have a need for your services but decided against it for any number of reasons. In this case, it may make sense to send a short automated email asking if they needed anything.
At the core of churn prevention is customer success and account management. The previous solutions are all marketing, but sometimes intervention is the most important part.
Setting up triggers in your CRM to notify the account manager is probably the most important strategy. It’s easy to get wrapped up in all the exciting strategies above, but at the core of business is relationships. A simple call to the right customer before they cancel can solve a large amount of churn.
If you can’t call every single at risk customer, consider sending a text message for smaller companies or sending email messages for large enterprises. Typically a win-back campaign with a survey can be effective to ask what issues they are having and how you can help.
By sending a survey, you can learn what problems they are facing and what they would like help with before they cancel. From there, you can reply quickly to show that you’re listening to your customers.
Churn is an important part of any Software as a Service company. By using some of these tactics, you may be able to anticipate churn and prevent it before it becomes a problem in your company. There is no replacement for fantastic account management, but hopefully, by applying some of these tactics, your company can reduce churn much further than what you thought was possible.
https://hackernoon.com/defining-churn-the-right-way-5396722ddb96
https://blog.hubspot.com/service/what-is-customer-churn
https://s3.amazonaws.com/assets.datacamp.com/production/course_6342/slides/chapter1.pdf
https://towardsdatascience.com/understanding-customer-churning-with-big-data-analytics-70ce4eb17669
https://towardsdatascience.com/telco-customer-churn-prediction-72f5cbfb8964
https://medium.com/@dtuk81/confusion-matrix-visualization-fc31e3f30fea