Google Analytics has renamed some of its metrics: Unique Visitors are now Users and Visits are now Sessions. I came across this help article about Users and thought it important enough to share with you, though you can certainly find it in the help yourself. It first came to my attention when I logged in today and got a message that, although I was using a filter (to exclude internal traffic to our sites from people such as myself or others in marketing and sales), I might not be getting the full picture. The link gave me the following, which doesn't really connect the message to the content, but hey, it's good stuff to know anyway. Dig in…
At a glance
The Users metric in Google Analytics (GA) can help you find out how many users viewed or interacted with your content within a specific date range.
GA uses two different techniques for calculating this metric, depending on the kind of report requested, so you can get the specific data you need in each report quickly. But because there are two different ways to calculate Users, there can be small discrepancies between the Users counts shown in different reports.
This article explains why and when GA uses each calculation technique.
An in-depth look
Background
To be able to quickly serve data to your reports, Analytics creates a set of unsampled, pre-aggregated data tables, which are updated on a daily basis. (For more information on how this works, see how sampling works in GA.) The pre-aggregated data tables are well equipped to handle common reporting requests, including changes to the date range in standard reports. For example, when you request a report, GA looks up each metric in the pre-aggregated data tables and serves that up to your reports. If you adjust the date range from August 1 – August 31 to August 1 – September 1, GA looks up each metric in the September 1 pre-aggregated data table and adds the new data to the existing total.
Although this works well for most metrics, it doesn’t work well for Users. Some metrics, like Pageviews or Screenviews, are simple additive counts over days, but Users is based on a more complicated calculation. Instead of just adding (or subtracting) processed data from the pre-aggregated data tables, GA needs to recalculate the metric for each different date range that you select in a report. For example, a user could open a website on August 31 and again on September 1, but GA recognizes this user as just one User over the course of these two days. In your reports, if you change the date range from August 1 – August 31 to August 1 – September 1, GA can’t simply add the September 1 value to the Users total you already see, because the same user may have been counted on an earlier day. Instead, the metric has to be calculated on the fly each time you request it in your reports.
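To see why Users cannot simply be summed across days, here is a minimal Python sketch. The client IDs and the per-day layout are made up for illustration and are not a description of how GA stores or processes its data.

```python
# Hypothetical data: the set of client IDs seen on each day.
daily_clients = {
    "Aug 31": {"client_a", "client_b"},
    "Sep 1": {"client_a", "client_c"},
}

def additive_users(days):
    """Sum the per-day user counts, as if Users were an additive metric."""
    return sum(len(daily_clients[day]) for day in days)

def deduplicated_users(days):
    """Count distinct clients across the whole date range."""
    seen = set()
    for day in days:
        seen |= daily_clients[day]
    return len(seen)

days = ["Aug 31", "Sep 1"]
print(additive_users(days))      # 4: client_a is counted on both days
print(deduplicated_users(days))  # 3: client_a is counted once
```

The additive version overcounts the returning visitor, which is why the count has to be redone for every date range rather than patched with one more day's value.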
As a solution to this challenge, GA has developed two different techniques for calculating Users. When you select Users in a report, GA will automatically select the optimal calculation to use depending on the report being viewed.
The following sections explain how GA uses these techniques to provide a fast and accurate User count in your reports.
Calculation 1: Pre-calculated data
To quickly display the Users metric, GA uses a calculation technique that relies only on the number of sessions in the given date range and the time of each session. (This is determined by technology managed on the device, like a web browser, and is often referred to as the client-side time.) Because the result of this calculation can be added to the pre-aggregated data tables, GA can reference the table to quickly retrieve and serve up this data in a report, including when you change the date range.
Calculation #1 is used exclusively in reports when the only dimension is a time frame, like the Date, Week of Year, or Month of Year. This means that you only see it in the Audience Overview report when no Segments are applied, or in a custom report where one of these date dimensions is the only applied dimension. When viewing Users over any non-date dimension, GA uses a second table, described below, in order to calculate Users on the fly.
Although this technique can quickly deliver unsampled data, it does have some disadvantages. This calculation relies on the number of sessions and on client-side time, so if a user’s client-side time is incorrect, or if you are looking at a view that filters out only some of a user’s sessions (rather than all of them), the data might be inconsistent.
To get around any potential inaccuracies, you can create a custom report with a non-date dimension that stays the same across a user’s sessions (e.g., Browser, Operating System, or Mobile Device). This forces GA to use Calculation #2 instead.
Calculation 2: Data calculated on the fly
Calculation #2 is based on the way you assign, collect, and store persistent data about your traffic. There are many solutions you can implement to customize this, but most commonly this data is assigned and stored through cookies managed by a web browser.
Calculation #2 requires heavy computation over large data sets, so it always references data in the raw session tables and not the pre-aggregated data tables. Calculation #2 takes more time than Calculation #1 to process and serve Users data to your reports because the values are calculated on the fly – GA can’t just look up and deliver data that’s already been processed and stored in the pre-aggregated data tables, as in Calculation #1. Instead, the calculation happens each time you make a request for it. Note that if certain conditions are met, this may induce sampling, but GA Premium account users can access unsampled reports.
Calculation #2 is used in custom reports and allows for the calculation of Users over any dimension, like Browser, City, or Source.
Note that for some dimensions, like Source or Medium, the same unique user can appear in multiple buckets (for example, someone who visited from both organic search and paid search in the same date range). For this reason, when viewing Users over such a dimension, the sum of the rows can be larger than the total and should not be expected to match it.
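A small Python sketch makes the row-sum effect concrete. The session data and the function names are hypothetical, chosen only to illustrate the counting logic, and are not GA's actual implementation.

```python
# Hypothetical sessions as (client_id, source) pairs.
sessions = [
    ("client_a", "organic"),
    ("client_a", "paid"),
    ("client_b", "organic"),
    ("client_c", "paid"),
]

def users_by_source(rows):
    """Count distinct clients within each source bucket."""
    buckets = {}
    for client_id, source in rows:
        buckets.setdefault(source, set()).add(client_id)
    return {source: len(clients) for source, clients in buckets.items()}

def total_users(rows):
    """Count distinct clients across all sources."""
    return len({client_id for client_id, _ in rows})

per_source = users_by_source(sessions)
print(per_source)                # {'organic': 2, 'paid': 2}
print(sum(per_source.values()))  # 4: client_a appears in both rows
print(total_users(sessions))     # 3
```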