You are a SaaS business and you have created the perfect customer acquisition strategy. You execute it flawlessly and people start flowing to your landing pages. Defining and keeping track of your metrics is key to your success and so you start measuring visits, conversion and activation rates. If the numbers are keep growing you are happy, right?
But if you stick only to the user acquisition related KPIs of the customer life cycle, you will end up with an incomplete and in many cases, erroneous view of the behavior of your customers.
Fear not, it is quite easy to add more value to your metrics. You only need to complement them with the KPI of Retention. Although it is simple in concept, it is often overlooked especially by first time founders. In this post we will describe what retention is and a few different versions of it. We will also see how easy it is to calculate it by using raw data from Intercom that we may load into a database using Blendo.
We could agree more, customer retention is a simple yet powerful metric.
Customer retention refers to the ability of a company or product to retain its customers over some specified period. (click to Tweet)
They key word here is “retain”, as it might have a different meaning depending on your business. For example, Retention, for a SaaS business with a freemium model, might be linked to how many times the user is logging on the platform.
As it happens with the majority of the simple but important concepts, there are many variations of customer retention. Let’s see a few of them.
- Full Retention. The proportion of customers who repeat a specific event, every single day, from a specific period.
- Classic Retention. How many customers are repeating a specific event exactly N days after a reference date (selected by us).
- Rolling Retention. The proportion of customers who are repeating a specific event after a specific period and any day after this period.
So for example, let's consider we want to calculate the Rolling Retention with N=28 starting from today. We consider a customer retained if she has performed an event 28 days from today and any day after that.
There are more Retention types, e.g. Return Retention or Bracket-dependent Return Retention. It’s up to you and on your understanding of your business to select the right one. Also, no one restricts you from calculating more than one. It might not be a bad idea at all, to calculate more than one types and compare them. For a more thorough description of all the above types of Customer Retention you can check this article here from Thomas Sommer.
It is important to keep in mind how Retention fits in your business and product. For example:
Which event you should choose to calculate it?
Is the login event enough?
What period makes sense for you?
No matter how well described the concept of Retention might be in any blog post (like this one), at the end you will have to put it into the right context to make it valuable.
Customers behavior tracking using Intercom
Intercom is a great tool for user engagement. At its core it allows anyone in a company, to engage and communicate with customers as interact with the product or any other touchpoint, e.g. a blog page. To deliver this kind of functionality, Intercom has to track events that users generate. The owner of the products defines these events. Moreover, we can use the same events to track the retention of customers. All it takes is to identify which event we would like to track and figure out a way to calculate the retention on this data.
To keep things simple, we will consider the following two properties that Intercom keeps for users:
- signed_up_at: Which is the date and time the user appeared (signed up) to the product for the first time.
- last_request_at: Which is the date and time the last event of a user was recorded by Intercom.
- user_id: because you know, we need to track our customers somehow.
We will use the
last_request_at for obvious reasons. It is a property Intercom tracks by default and accessible to everyone, while it represents some kind of interaction of the customer with the product. If it is preferred to track a specific event, the mechanism of Webhooks should be used. Read more information on how to use Webhooks with Intercom here.
Up until now we:
- Know when a customer signed up for the first time
- When the last event was triggered
- We can start calculating different versions of Customer Retention
...But we need to get access to this data first...
Once we get our data was in one place -- raw, updated, and synced -- analyzing it and calculating the Customer Retention with SQL becomes much more flexible.
At Blendo, we’ve built a platform for easily collecting and consolidating all your data in a data warehouse like Amazon Redshift or a database like PosgreSQL.
You just need to hook up an Intercom account together with the parameters of a database connection and all your data will be available in just a few minutes for analysis.
If you want more information on how to sync your Intercom account with a database, check here.
After a successful sync, a table named
users will be available on the database, with the fields mentioned earlier available for every user that Intercom is aware of.
How to calculate the Customer Retention
Let’s see how to calculate the Classic Retention. To be exact, we’d like to know the number of customers who signed up each month and who returned back to the product the next month. This is the One month Classic Retention.
The table we have from Intercom contains a lot more information than the 3 fields we need. It's a good idea to do some “data preparation” and create some more basic views containing only the necessary information.
The first view will contain the information of when each customer has signed up. We will create it using the following query:
select user_id, date_trunc('month',signed_up_at) as sign_up_month from intercom_users
What is important to note from this simple query, is the usage of the
As people can sign up at any date we are interested to “round up” this information to each month and this is exactly what this function does. At the end only the information of the month that the user signed up will be retained.
The second preparatory query is the next one:
select user_id, date_trunc('month', last_request_at) as last_user_event from intercom_users
This view will contain the
user_id together with the date when the last event for each customer was recorded, again we are using the
date_trunc function as we care only for the month to calculate the customer retention.
To calculate the customer retention using the results of the previous two queries, we need another one. That SQL query may look like the following:
select sign_up_month, count(distinct users.user_id) as signed_up_users, count(distinct events.user_id) as logged_in_users, case when count(distinct users.user_id) > 0 then count(distinct events.user_id) * 100 / count(distinct users.user_id) else 0 end as retention_pct FROM users LEFT JOIN events ON users.user_id = events.user_id AND last_user_event >= users.sign_up_month + interval '1 month' AND last_user_event < users.sign_up_month + interval '2 months' WHERE users.sign_up_month >= '2016-01-01' AND users.sign_up_month < '2016-05-31' GROUP BY 1 ORDER BY 1;
This query joins the sign up and event queries results and performs a few counts to come up with the customer retention.
A few important notes on this SQL query:
- There may be dates without sign ups. We need to use a conditional count to avoid divisions by zero.
- The retention is calculated for a one-month period. You can change this easily by changing the value at the second
“AND”clause of the
- The Customer Retention is calculated for signups that happened in each month from 2016-01-01 until 2016-05-31. Again you can change this to your preference.
The complete SQL query is the following:
with users as (select user_id, date_trunc('month',signed_up_at) as sign_up_month from intercom_users), events as ( select user_id, date_trunc('month', last_request_at) as last_user_event from intercom_users) select sign_up_month, count(distinct users.user_id) as signed_up_users, count(distinct events.user_id) as logged_in_users, case when count(distinct users.user_id) > 0 then count(distinct events.user_id) * 100 / count(distinct users.user_id) else 0 end as retention_pct FROM users LEFT JOIN events ON users.user_id = events.user_id AND last_user_event >= users.sign_up_month + interval '1 month' AND last_user_event < users.sign_up_month + interval '2 months' WHERE users.sign_up_month >= '2016-01-01' AND users.sign_up_month < '2016-05-31' GROUP BY 1 ORDER BY 1;
A sample of the results should look like this:
Interpreting the results is quite straight forward. For example, for 2016-01-01 we have 735 signups. Out of these, only 66 customers had any kind of activity the next month, this corresponds to a customer retention of 8%.
With this example is becomes quite obvious why retention is important to be used together with the customer acquisition metrics.
It doesn’t matter how many customers we attract if we don’t manage to retain them and of course convert them into paying customers. (click to Tweet)
To be continued…
Today, we scratched only the surface of how to calculate and track customer retention. In future posts we will see how to use Cohort Analysis to better understand how retention behaves in time. Or how to link customer retention with other important metrics, with the most important one being the LTV of customers. To do that we will need to start combining data coming from multiple sources. And this is where Blendo shines, so do not forget to check it out.
We encourage you to start syncing your data into your data warehouse. Keep track of every KPI that is important to your business and why not, use the above queries with your data.