It was seven years ago that I first met Google Analytics. I was looking for a partner, someone who'd stand by my side through the ups and downs of running a web business. When I met Analytics, she seemed perfect.
Although things between us started well, our early relationship was not without its hiccups: Sometimes Analytics confused me. Sometimes I suspected her of lying to me. And sometimes she would urgently try to tell me something important, but I wouldn't pay attention.
Over the years, we ironed out our differences, and now we enjoy a stable, mature relationship. If you find your relationship with Analytics is on the rocks, this loving guide will light the way forward.
- Why Bother with Analytics?
- Workflow Tips
- Connect to Other Google Products
- Location of Snippet in HTML
- Cross-Browser and Cross-Device Tracking (User ID)
- Subdomain Blues
- Cleanup Noise, Chaff, and Distortion
- Analytics and Social Media
- Illuminating Analytics Blind Spots
- Collect Your On-Site Search Queries
- Measuring Success
- Precautions
- Be Disciplined in Making Annotations
- Send Alerts
- Analytics Blocking
{:.in-page-navigation}
Why Bother with Analytics?
Guides to website ownership seemingly all recommend that you install Google Analytics. Why the universal recommendation? What will you learn from tracking how your visitors use your website, and how will this information benefit you?
Determining the success of all your different traffic acquisition channels. Which channel brought you more traffic—social media vs email vs paid advertising vs organic vs partner websites? And how well did traffic from each of these sources convert compared to the others?
Determining how various marketing campaigns performed within each channel. You’ll be able to see that your Halloween Treats advertising campaign generated more sales than your Autumn Sale campaign.
Providing a problem detection system for uncovering issues with your business. Assuming that you have familiarised yourself with the normal range of values for a few highly informative metrics (such as “unique visits per week” or “revenue per week”[^3-1-1]), then, just by glancing at these figures every few days, you can assure yourself that your business is healthy. But if you notice that one of these figures is way off, that most likely means that you’ve got a serious problem somewhere which you ought to investigate…
Learning about how your business goals (e.g., revenue, leads, sign-ups, etc.) are correlated with other attributes (e.g., the visitor’s country, the product categories they browsed, whether they are visiting from a mobile or desktop device, which website they arrived from). This information helps you focus future efforts.
Understanding your website’s seasonality. Through graphical analysis of past traffic and revenue, you will be able to discern seasonal patterns and be better able to project and anticipate future seasonal variation.
Determining the most successful landing pages. When you know which ones were effective, you can study them with the goal of creating more of these kinds of successes in the future.
Measuring which browsers and devices were most commonly used by your visitors. This enables you to decide the relative priorities to assign each platform in your graphic design and your software testing. Optimising for desktop is wasteful if 98% of your traffic comes from mobile.
Improving the effectiveness of Google AdWords advertising. With a bit of configuration, you can see how much profit you are earning (on average) for each click on a given paid keyword. This, in turn, will inform your bidding in Google AdWords, ensuring that you are always profitable in your advertising.
Gathering demographic data, such as what countries and cities your customers are based in. This is especially relevant when geography influences what products you stock or what languages you should present your website in.
Measuring the snappiness of your website. Identifying slow pages can give you some “quick” gains in SEO and conversion optimisation.
Discovering which pages of your website are most engaging, as evidenced by page view counts, repeat visitor counts, or average times on page. This knowledge could inform website redesigns, for example by prompting you to link to these beloved pages from the home page.
Alerting you of web pages that are accidentally publicly available. For example, if some part of your website was supposed to be behind an admin-only password wall, then a Google Analytics report that says hundreds of people saw that content indicates that you’ve got a security bug.
Knowing what Google Search queries your website appears for, along with your website’s rank in the results page and the click-through rate your entry achieves.[^3-1-2]
Learning what your visitors search for within your own website (via your on-site search box). This could prompt you to start stocking products which you don’t yet sell, but which your customers are already looking for.
Pinpointing the stage within your funnels where the most leakage occurs, and thereby empowering you to focus conversion optimisation efforts on these problem points. For example, do a higher percentage of people drop out at the payment stage or the registration stage?
Identifying which pages have high exit rates or high bounce rates, either of which would indicate that a page is scaring away traffic and needs some work.
Stroking of ego. Seeing tens of daily website hits grow to hundreds and then to thousands is gratifying, and this indulgence is a healthy one, thanks to its motivational effect.
Workflow Tips
The sanity-preserving Analytics workflow: Form plain English sentences backed by hard data
Google Analytics is daunting: There are so many report screens with so many variables and tiny little numbers that it’s all too easy to drown in data. But in spite of this overwhelming feeling, it is nevertheless still worth wading in and exploring. Indeed, some of the best insights can come from wandering around the data without a plan.
There is a nice trick to help you regain a feeling of control during these exploratory sessions: Approach each session with the express goal of forming a few sentences about your business, as if you were going to present some curious facts and observations to your friends over lunch. This means you want to end up with sentences like “Fifteen percent of our web visitors come from Germany, yet none of them end up buying our products” or “Our pricing web page has double the amount of clicks as any other page we display in our global navigation” or “MS Windows phone users are four times as likely as users of other platforms to leave the website within five seconds of arriving”. When data is condensed into English like this, it becomes more memorable and far more actionable.
Navigate Analytics by searching
It’s always a pain in the ass to find a specific report in Google Analytics. Often the one you need lies nested (or doubly nested) somewhere deep in the left-hand navigation panel.
Instead of straining to find a report by trial-and-error browsing, it’s much easier to search for it in the upper-left-side search bar titled “Find Reports & More”.
This search works like magic, and it will save you a whole lot of time.
Connect to Other Google Products
AdWords
Background: I believe that anyone running an online business should, at the very least, try to find themselves a profitable stream of traffic through Google AdWords. I deal with the specifics of paid advertising elsewhere in this book, but I want to mention the importance of associating your AdWords and Analytics accounts here.
Why do this? One of the biggest reasons is that you’ll get access to a Google Analytics column called ROAS (“Return On Ad Spend”), which is basically a god figure for advertisers. ROAS is calculated by dividing the revenue generated by your Google AdWords traffic[^3-1-3] by the cost you incurred in buying that traffic.[^3-1-4] This figure is so useful because it sums up the profitability of your advertising by AdWords campaign, by keyword, or even by landing page.
An ROAS of 100% means that your revenue was equal to your advertising spend. You’ll want an ROAS of at least 100%, if not significantly more, so as to account for your other expenses in addition to advertising costs. If you know your average profit margins, then you’ll be able to calculate a minimally acceptable ROAS figure, and then ensure that you only advertise whenever this figure is exceeded.
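To make the arithmetic concrete, here is a minimal sketch in Python. The campaign figures and the 25% margin are made up for illustration, and `profit_margin` is assumed to mean the share of revenue left over after every cost except advertising.

```python
def roas(revenue: float, ad_cost: float) -> float:
    """Return on ad spend as a fraction (1.0 means 100%)."""
    if ad_cost <= 0:
        raise ValueError("ad_cost must be positive")
    return revenue / ad_cost


def break_even_roas(profit_margin: float) -> float:
    """Smallest ROAS at which the advertising does not lose money.

    profit_margin is the share of revenue left after all
    non-advertising costs (0.25 means 25%).
    """
    if not 0 < profit_margin <= 1:
        raise ValueError("profit_margin must be in (0, 1]")
    return 1 / profit_margin


# Hypothetical campaign: 1,200 in revenue bought with 400 of ad spend.
campaign_roas = roas(revenue=1200.0, ad_cost=400.0)
minimum = break_even_roas(profit_margin=0.25)
is_profitable = campaign_roas >= minimum
```

In this made-up case the campaign returns 300%, which looks healthy until you notice that a 25% margin demands at least 400% just to break even.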
You can access the ROAS column by viewing any of the Analytics reports filed under Acquisition > AdWords and then clicking on the “Clicks” tab within the “Explorer” bar. After doing this, the ROAS column (and a few other related ones) will appear in the main data table.
Yet another good reason to associate your AdWords and Analytics accounts is to unify conversion tracking between the two platforms. If you are not so familiar with these platforms, you should know that both have their own way of tracking conversions: AdWords has its “conversions” feature and Analytics has its “goals”/“e-commerce” tracking features. Theoretically, you could use the conversion tracking provided on both platforms, but this isn’t recommended. For one, the AdWords conversion tracking features are not as flexible as the Analytics ones, and this will make integration more difficult for your tech team. Moreover, simultaneously tracking conversions in AdWords and Analytics creates the risk that the figures recorded in one platform start diverging from those in the other. This risk is especially heightened after you make changes to your website that require you to update your conversion tracking (it’s possible to forget to update things for one of the two platforms). Unnoticed problems like these could lead to you getting different conversion numbers depending on where you look. Save yourself the hassle and skip the AdWords conversion features completely: it’s easier to set up Analytics goals and link them to your AdWords account.
Convinced? Great! A guide to associating these two accounts can be found by following the link in the footnotes.[^3-1-5]
Search Console
Google Search Console (formerly known as Webmaster Tools) helps you monitor your site’s presence within Google Search results, helping you answer questions like “which queries caused my site to appear in search results?” or “how highly do I rank in the results for this query?” or “how many people clicked through to my website after seeing it mentioned within the search results?”
Although this information is already available within the Search Console[^3-1-6] platform, it is nevertheless helpful to have it available in Google Analytics so you have everything in one place. This is done by asking Google to associate your Search Console account from your Analytics admin page.[^3-1-7] Please be aware, though, that because of privacy concerns, Google Analytics places heavy limits on how you can associate your search results data with other metrics in Analytics. If you run into a block, it probably isn’t a bug…
Location of Snippet in HTML
Your Google Analytics code should appear inside the `<head>` section of each page’s HTML. This is because code which appears higher up in HTML gets executed sooner, and code that gets executed sooner is less likely to be skipped if a visitor happens to navigate to another page before the current one has finished loading. By having the Analytics code high up in the page, you decrease the chances that Analytics will underreport hits or conversions.
Implementing this isn’t as simple as copying and pasting the Analytics snippet high into your website’s `<head>` and then calling it a day. Problems appear when you have an advanced Analytics installation that requires you to send additional data to Google (e.g., ecommerce revenue data, events, custom dimensions, or User IDs). In order for this extra data to get sent, it must be already prepared and available within the website `<head>`, and this requirement can have major consequences for your website’s architecture or caching layer.
Cross-Browser and Cross-Device Tracking (User ID)
Gone are the days when people interact with your business from their one and only computer. Instead, they first encounter your website on their phone, next download your app, and then finally complete a purchase on their laptop a week or two later.
By default, Google Analytics records these three interactions as coming from three distinct users. But this is a distortion of what actually happens. Without any corrective intervention, Google Analytics will give the impression that the traffic from your mobile website and your app was totally worthless, whereas the traffic from your laptop users was gold. But the truth is that the mobile traffic was crucial for making first contact, and there would be no eventual purchases from laptop devices without this step.
Why does Google Analytics falter in this way? This comes down to the technicalities of how Google ties browser sessions together into overall user stories. Whenever someone visits a website that has Google Analytics installed, Google drops a cookie on that visitor’s browser. Should that person happen to visit the website again, Google will notice that a cookie was already dropped on the browser, and they can use this data to link the earlier session to the current session. But there is a big limitation to all this: Due to how cookie technology is designed, cookies are specific to one browser (i.e., they are neither shared with other browsers running on the same device nor with browsers running on other devices). This means that Google Analytics is unable to stitch sessions together which occurred on multiple devices or on multiple browsers. And to further complicate the picture, even the cookies within the one browser can expire or get deleted.[^3-1-8]
Google Analytics has a feature called User ID that can overcome these difficulties and help you stitch together sessions occurring on multiple browsers/devices that would otherwise appear separate. To use this feature, your website must generate a unique ID for each user and send that ID to Google Analytics. If tracked traffic from your iPhone app sends the same User ID as the traffic from your regular website, Google Analytics can see that both incidents belong to the one story. Be careful that you don’t send Google a User ID that contains information that would enable them to personally identify this person (e.g., a username or email address). Instead, send an identifier only known to you (e.g., an ID number you generate in your database).
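The difference between the two ways of counting can be sketched in a few lines of Python. This is only an illustration of the idea, not how Google Analytics is implemented; the session rows, the `browser_id` values (standing in for the per-browser cookie), and the `u-1001` style IDs are all invented.

```python
from collections import defaultdict

# Hypothetical sessions: (browser_id, user_id or None, device)
sessions = [
    ("cookie-a", "u-1001", "phone"),
    ("cookie-b", "u-1001", "app"),
    ("cookie-c", "u-1001", "laptop"),
    ("cookie-d", "u-2002", "laptop"),
]


def users_by_cookie(rows):
    """Default view: every browser cookie looks like a separate user."""
    return {browser_id for browser_id, _user_id, _device in rows}


def devices_by_user_id(rows):
    """User ID view: sessions sharing an ID are stitched into one story."""
    stitched = defaultdict(list)
    for _browser_id, user_id, device in rows:
        if user_id is not None:  # logged-out sessions cannot be stitched
            stitched[user_id].append(device)
    return dict(stitched)
```

Counted by cookie, these four sessions look like four people. Counted by User ID, they collapse into two, one of whom went from phone to app to laptop.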
There is a proviso for this feature to work: You have to be able to reliably create, store, and assign a unique ID to your users during every session. The most typical implementation of this is to tie in the Analytics User ID to whatever ID you use internally to manage logged-in user accounts.[^3-1-9] This makes sense because it leverages existing login and database functionality, but it has the downside that your tracking will not account for visitors who don’t log in during a particular session. (However, this setup does manage to capture visitors who begin a session while logged out but log in later during the same session.[^3-1-10])
After you set up User ID tracking on Google Analytics,[^3-1-11] you will be encouraged to create a separate “view” of your Analytics data that will be limited exclusively to the visitors for whom you have User IDs.[^3-1-12] When you switch your Analytics reports to this view, you’ll see all your usual reports, along with a few new ones which would be otherwise unavailable. These are filed under Audiences > Cross Device.
The Device Overlap report lets you answer questions like “what is the conversion rate for customers that interact with my business through the combination of devices X, Y, and Z?” (You might learn that you have a 1.2% conversion rate for users that only use desktop and 1.3% for users that only use mobile, but a 3.4% conversion rate with users that accessed your website on both desktop and mobile. This would indicate that you should advertise on both platforms simultaneously.)
The Device Acquisition report helps you understand the value of various “first contact” devices. You can answer questions like “how much revenue do I eventually earn when a user first encounters my business using device X?” If you know that the users who find you on mobile are the most likely to eventually buy, then you should focus your “fresh leads” advertising budget on this more profitable group.
The Device Paths report lists which devices were used (in what order) within the steps taken before (or after) some goal completion, event, or ecommerce transaction. This information could let a SaaS (Software as a Service) company know which devices to prioritise for each goal type.
A word of warning: User ID tracking isn’t and cannot be retrospective, so be sure to have it set up before you launch your website.
Subdomain Blues
Conflation of traffic with the same relative paths on different subdomains
By default, Google Analytics records only the relative path of a hit (e.g., “/about_us”) instead of the full path with the host name included (e.g., “www.mysite.com/about_us”). This becomes problematic when your website is hosted across multiple subdomains. As far as a Google Analytics account configured to record relative paths can see, hits to “blog.mysite.com/index.html” are indistinguishable from hits to “www.mysite.com/index.html”. In its eyes, the URL “/index.html” was simply visited twice.
This is obviously not what you want, and this sort of conflation can invalidate your reported data by incorrectly bucketing together hits to distinct entities. The solution to this conundrum is to configure Google Analytics to record the full URL instead of just the relative path name. This is done with a filter that rewrites the “page” dimension within Google Analytics so as to include the host name.[^3-1-13] Be warned though: Adding this filter to an existing account will invalidate existing Google Analytics goals (or other filters) that are programmed to trigger on exact matches of URLs. For example, a goal configured to trigger when the page exactly matches “/thanks_for_buying” will no longer trigger when the page data becomes “www.mysite.com/thanks_for_buying” because the additional host name information means a character-for-character perfect match no longer exists. This means that your Analytics installation will stop recording goal completions until you update your goal definition to work in harmony with full URL names.
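Both halves of this trade-off can be shown with a small Python sketch. The two functions are hypothetical stand-ins for the filter and for an exact-match goal; they only mimic the string handling described above and are not Google’s code.

```python
def page_with_host(hostname: str, path: str) -> str:
    """Mimic a filter that prepends the host name to the page value."""
    return hostname + path


def goal_exact_match(goal_pattern: str, page: str) -> bool:
    """An exact-match goal fires only on a character-for-character match."""
    return page == goal_pattern


# Page value as recorded before and after the filter is added.
before = "/thanks_for_buying"
after = page_with_host("www.mysite.com", "/thanks_for_buying")

old_goal_still_fires = goal_exact_match("/thanks_for_buying", after)
updated_goal_fires = goal_exact_match("www.mysite.com/thanks_for_buying", after)
```

With the filter in place, the blog and www versions of “/index.html” finally become distinct strings, but the old goal pattern stops matching until it is rewritten to include the host name.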
Session data resetting across subdomains
Whenever a visitor navigates to a different subdomain on your website, their session data will be reset, meaning that, as far as Analytics is concerned, a separate visitor arrived, and this “new” visitor was referred to your website by your other subdomain (instead of by the true outside source like blog X or social network Y or advertising campaign Z). This session resetting causes all sorts of distortions in your reported data. For example, it will decrease the reported session length of each visit, it will clobber over information about which marketing channel originally delivered the visitor to you in the first place, and it will mess up analysis of the value of each marketing channel.
The fix to these ills comes in the form of configuring Referral Exclusion Lists for your various domains[^3-1-14] and integrating Google’s autolink functionality.[^3-1-15]
Cleanup Noise, Chaff, and Distortion
Lowercase Request URL
In the world of URLs, character case matters. By way of example, the following three URLs count as separate entities:
/products/iphone
/products/IPHONE
/products/Iphone
And because these three are considered separate entities, Google Analytics dutifully records them all as distinct items in its reports. This would be a good thing if your website showed different content depending on the case of the URL. But such a website architecture is rarely done (and is terribly confusing when it is done), so I would argue that Google Analytics’ default treatment is going to spell trouble for the typical webmaster. This is because its effect is to pick up URLs entered incorrectly by your users (or linkers) and cause their tracking data to disperse across the variously cased URLs in your reports, ultimately causing your website’s performance to be more difficult to analyse.[^3-1-16]
The solution is to merge all these separated entries into a single Google Analytics entry by creating a custom Google Analytics filter that lowercases the request URL.[^3-1-17]
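In effect, the lowercase filter does something like the following Python sketch. The page view counts are invented, and the function only illustrates the merging, not the filter’s actual implementation.

```python
from collections import Counter

# Hypothetical page view counts as they would appear without the filter.
raw_pageviews = Counter({
    "/products/iphone": 120,
    "/products/IPHONE": 7,
    "/products/Iphone": 13,
})


def merge_by_lowercase(pageviews: Counter) -> Counter:
    """Fold differently cased URLs into one lowercased entry."""
    merged = Counter()
    for path, count in pageviews.items():
        merged[path.lower()] += count
    return merged
```

The three rows become a single “/products/iphone” row carrying the combined count.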
That said, this problem should really be rectified within your web application’s software layer. Your website should only respond with HTTP status code 200 to lowercased URLs; when someone enters an uppercased variant, your website’s router should redirect (i.e., HTTP status code 301) to the lowercase version, or just not respond at all. But sometimes making this software change isn’t possible or practical, and in these cases the Analytics filter in the preceding paragraph will get you out of a fix.
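The routing rule itself is tiny. Here is a framework-agnostic sketch in Python; `route` is a hypothetical helper that returns a status code and a redirect target, which you would adapt to whatever router your web framework provides. It takes only the path, on the assumption that query strings may legitimately contain uppercase values.

```python
from typing import Optional, Tuple


def route(path: str) -> Tuple[int, Optional[str]]:
    """Return (status_code, redirect_location) for a request path."""
    lowered = path.lower()
    if path != lowered:
        # Permanent redirect to the canonical lowercase URL.
        return 301, lowered
    # Already canonical: serve the page as normal.
    return 200, None
```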
Lowercase all UTM parameters (URL Builder parameters)
UTM parameters smuggle extra data into your Google Analytics reports by appending information to the end of your URLs. These parameters, which you’ll either set yourself or have platforms like Google AdWords set automatically for you, contain information about the source of the traffic (e.g., marketing channel, advertising campaign, or CPC keyword that generated the traffic).
I bring up UTM parameters here because there is a danger that your UTM parameters may appear lowercased in one advertising campaign and upper/mixed cased elsewhere (e.g., utm_source=email in one place and utm_source=Email elsewhere). This would cause Google Analytics entries for the email source to be split across two places (“email” and “Email”), making analysis annoying to carry out and dramatically more error-prone.
While it’s certainly possible to instigate company policies which demand that you and your staff always lowercase UTM parameters, realistically speaking, you can’t guarantee that this policy will be consistently applied—after all, human memory is oh so fallible. A better solution would be to create filters within Google Analytics that automatically lowercase each of the UTM parameters, thereby ensuring that this problem can never occur.[^3-1-18]
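The normalisation those filters perform amounts to the following, sketched here in Python on a raw URL. Treat it as an illustration of the effect rather than the mechanism; the URL and the non-UTM `ref` parameter are made up.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def lowercase_utm(url: str) -> str:
    """Lowercase the values of utm_* parameters, leaving others untouched."""
    parts = urlsplit(url)
    pairs = [
        (key, value.lower() if key.lower().startswith("utm_") else value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
    ]
    return urlunsplit(parts._replace(query=urlencode(pairs)))


# Hypothetical link from an email newsletter with mixed-case values.
cleaned = lowercase_utm("/sale?utm_source=Email&utm_campaign=Halloween&ref=ABC")
```

After this step, “Email” and “email” land in the same bucket, while the unrelated `ref` value keeps its original case.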
Remove sorting/pagination/noisy parameters
In some quarters of your website, your software might automatically append parameters to the URL. The classic example is when you have a listing page with the option to sort the listed products by price, size, or some other attribute. A common implementation for this feature involves redirecting users to a modified URL which has been updated to include these sort parameters. (The reasoning for such a URL design is that appending the sort parameters to the URL enables the user to bookmark or share that particular view of the page.)
Here’s an example of one such URL: A student shopping for medicine textbooks sorts by price in ascending direction, and you can see that these sorting instructions are encoded into the end of the URL:
/taxons/medicine?direction=asc&sort=price
Google Analytics (as with the rest of the internet) treats each URL as unique, so, within your reports, you will see separate entries for every combination of sorting parameters. This means that your report could contain distinct entries for
/taxons/medicine?direction=asc&sort=price
/taxons/medicine?direction=desc&sort=price
/taxons/medicine?direction=asc&sort=institution
/taxons/medicine?direction=asc&sort=name
But this is not what you want, from an analytical point of view, because it means that your data about the “/taxons/medicine” page will be dispersed across four entries and correspondingly awkward to work with. From the business point of view, each of the above four entries contains the same content (i.e., medicine textbooks). The various entries in the Analytics report only represent different orderings of this same content. It is more useful for you, the marketer, to bucket all the above into a single Analytics entry for
/taxons/medicine
Luckily for us, it’s possible to configure Google Analytics to automatically collapse URLs that differ only in their sort, filter, or pagination parameters into their proper catch-all URLs. To do this, you simply tell Google Analytics the names of the URL parameters which you want it to ignore. This is done through the Exclude URL parameters feature.[^3-1-19] In the above sorting example, we’d tell Google to ignore the “direction” and “sort” parameters, and this instruction would collapse the four entries into one for “/taxons/medicine”.
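If it helps to see the collapsing spelled out, here is a rough Python equivalent of what the setting does to each URL. The function name and the `EXCLUDED` set are my own; only the parameter names come from the example above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Parameter names we want the reports to ignore.
EXCLUDED = {"direction", "sort"}


def strip_params(url: str, excluded=EXCLUDED) -> str:
    """Drop the excluded query parameters and keep everything else."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in excluded
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


variants = [
    "/taxons/medicine?direction=asc&sort=price",
    "/taxons/medicine?direction=desc&sort=price",
    "/taxons/medicine?direction=asc&sort=institution",
    "/taxons/medicine?direction=asc&sort=name",
]
collapsed = {strip_params(url) for url in variants}
```

All four variants end up as the same entry, and any parameter not named in the exclusion list passes through untouched.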
Some caution is advised in setting up these filters because they apply not just to parameters in individual URLs, but also to parameters contained in any URL across your whole website. If you aren’t careful, you might unintentionally collapse together URLs elsewhere on your website that really ought to appear distinctly in your reports. For example, what if “sort” was used to mean “type” or “variety” in some other part of your website (e.g., for product variant pages like “/products/Xlaptop?sort=15-inch-screen” and “/products/Xlaptop?sort=13-inch-screen”)? You may well want these two variants to appear as distinct entities in your reports.
The best way to avoid these kinds of unintentional clobbering problems is to be consistent with the parameter names used on your website for sorting, filtering, or paginating content. This is something that should be thought about up front when programming and designing the URL surface area of the website. If the parameter name “sort” is used for sorting data, treat this as reserved and ensure it isn’t used elsewhere to mean something else. (Ditto for “page” and other parameters).
Above, I talked about filtering out parameters for reordering content (such as sorting and filtering). But there may be other kinds of pesky parameters that add nothing to reporting, and that you’ll consequently want to exclude. Indeed, anything that’s noisy, unimportant, or that leaks sensitive information (such as customer names) ought to be expelled from the URL. This means that you will probably want to strip out parameters like the following:
order_token=
utf8=
affiliate_token=
customer_email=
Exclude bots
The internet is inhabited not just by humans, but also by bots. These robotic ghosts will visit your website day and night and thereby inflate your view counts. While this is great for bragging rights and for impressing hopelessly naïve investors, ultimately, the view count boost is a Pyrrhic victory because these bots will taint and distort all your other statistics, like traffic acquisition channels, conversion rates, bounce rates, and session durations. Your reports, now muddied, will leave you with incorrect impressions about your website, and any intelligence you once gleaned from studying your metrics will now be lost—unless you configured Google Analytics to exclude bots.[^3-1-20]
Exclude internal IP address from your live tracker
Whenever you visit your own live website, you skew your Google Analytics results ever so slightly. This happens because your activities do not represent those of the typical user. The average user, after all, does not spend full days proofreading, tinkering with the CSS, or checking each page’s performance in Chrome DevTools.
More problematic are the effects of internal automated tools that interact with your live website. I have in mind here cache-warmers or stress-testing suites, both of which deliver a lot of unrepresentative hits.
For these and other reasons, you should exclude internal traffic from Google Analytics reports. Assuming you work from a static IP address, you can sort this out by setting up some IP filters.[^3-1-21]
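Conceptually, an IP filter is just a membership test. The Python sketch below uses the standard `ipaddress` module; the office ranges are placeholders taken from address blocks reserved for documentation, so swap in your own.

```python
import ipaddress

# Hypothetical internal ranges (documentation-reserved blocks).
INTERNAL_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),    # e.g., the office network
    ipaddress.ip_network("198.51.100.17/32"),  # e.g., a single test server
]


def is_internal(ip: str) -> bool:
    """True if the address falls inside any internal range."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in INTERNAL_NETWORKS)


def external_hits(hits):
    """Keep only the hits that did not come from internal addresses."""
    return [hit for hit in hits if not is_internal(hit["ip"])]
```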
Set up a separate web property in Analytics for staging/testing
Absent any special precautions, all your activity on testing/local/development versions of your website will pollute your live Google Analytics data. When your programmer adds something to the cart 20 times during a bug fix, your business statistics on Google Analytics will be similarly inflated.
There are a few ways around this, but the most useful is to send data from test versions of the website to a separate Google Analytics tracking profile.[^3-1-22] This has two advantages. Firstly, it stops you polluting your live data. Secondly, the new “test” profile will help you verify—before deploying—that Google Analytics will continue working after you deploy your most recent changes, helping you to catch potential reporting bugs before they do irreversible damage.
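One way to wire this up is to pick the tracking ID from the environment when rendering the page. The Python sketch below assumes a hypothetical `APP_ENV` environment variable and uses placeholder property IDs; it deliberately falls back to the test property, so a misconfigured server can never pollute live data.

```python
import os

# Placeholder property IDs; replace with your own.
LIVE_PROPERTY = "UA-XXXXX-1"
TEST_PROPERTY = "UA-XXXXX-2"


def tracking_id_for(environment: str) -> str:
    """Only production reports to the live property; all else goes to test."""
    return LIVE_PROPERTY if environment == "production" else TEST_PROPERTY


# APP_ENV is an assumed variable name; default to the safe choice.
current_tracking_id = tracking_id_for(os.environ.get("APP_ENV", "development"))
```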
Exclude unwanted referrers
Every time a visitor leaves your website only to later return, Analytics tracks a new session, associating it with a new “referrer” website. Normally this is perfectly sensible and correct, but there are a few isolated circumstances when recording a return visit in this default manner will distort your reporting data. Suppose you are using an off-site payment provider, like Paypal. Someone comes to your website via a Facebook advertisement, adds a product to the cart, checks out, gets redirected to pay on Paypal, then returns to your website and sees the thank-you page. By default, Google Analytics views this last hit as a brand new session, referred by Paypal.com. If your website displays a conversion pixel on the thank-you page (as many websites do), then this conversion will be attributed to traffic referred by Paypal instead of traffic referred by your Facebook advertisements. This makes it look like Paypal.com is a major referrer of converting traffic, when in reality it was Facebook that brought home the bacon. The worst part of it all is that the traffic coming from Facebook advertising is reported as culminating in nothing but cart abandonments, occurring on the payment page.
The best solution to this distortion is not to use off-site redirects within your flows. But if it isn’t possible to remove these, you can still prevent this reporting problem by adding a Referral Exclusion for paypal.com.[^3-1-23] Setting this exclusion causes the user session to continue after the visitor returns from Paypal, meaning that Google Analytics will attribute the conversion to the original referring source, Facebook, instead of to Paypal.
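The attribution rule can be boiled down to a few lines of Python. This is a deliberately simplified model of the behaviour described above (real session handling involves timeouts and more), and the function and argument names are my own invention.

```python
from typing import Optional, Set


def attributed_source(
    current_source: Optional[str],
    referrer_domain: str,
    exclusions: Set[str],
) -> str:
    """Return the source credited for a hit arriving from referrer_domain.

    current_source is the source of the visitor's ongoing session, if any.
    """
    if current_source is not None and referrer_domain in exclusions:
        # Excluded referrer: the existing session simply continues.
        return current_source
    # Otherwise a new session starts, credited to the referrer.
    return referrer_domain


without_exclusion = attributed_source("facebook.com", "paypal.com", set())
with_exclusion = attributed_source("facebook.com", "paypal.com", {"paypal.com"})
```

Without the exclusion, the conversion is credited to paypal.com. With it, the credit stays with facebook.com, where it belongs.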
Analytics and Social Media
Social interactions
Google Analytics has a feature to record social network interactions such as Facebook likes/shares or Twitter tweets. Why are these social media interactions worth tracking? Because Google Analytics gives you convenient reports about your social media performance. This convenience stems from you being able to see, in a single Analytics report, how each of your pages fared. This beats flicking from page to page, especially if there are tens of thousands of pages on your website. Furthermore, manually checking the counters on each page is increasingly becoming impossible because social media platforms such as Twitter are starting to remove the counters from their on-site widgets. When these restrictions are in place, you’ll have no idea about how often some piece of your content was shared, unless you are tracking it with Analytics.
The main advantage of tracking shares is knowing what type of content is striking a chord with your readership so you can create more content like it in the future. There’s also a smaller advantage from the point of view of website design: Knowing which network share buttons get used and which ones don’t will prompt you to remove the underperforming buttons so as to reduce website clutter.
The Analytics report isn’t totally comprehensive though: It only tracks interactions that occur when someone uses the social media plugin installed on your website. The report knows nothing about social media interactions that happen elsewhere (e.g., retweets that originate on Twitter or manual copy-and-pastes of your URL into a Facebook status update).
Social interaction tracking is implemented using JavaScript that subscribes to social plugin events and forwards this information to Google Analytics.[^3-1-24] For marketers who don’t want to mess with setting this JavaScript up manually, there is a lazy option (the Autotrack plugin) that requires no configuration and is available to anyone using the official Twitter and Facebook buttons.[^3-1-25]
Events for Remarketing
The fact that someone is sharing your website on social media shows that they are highly engaged with your brand and appreciate your work. What’s more, they are part of the “sharer-erati”, the active and vocal subset of internet users predisposed to sharing content. Who better than them to remarket future content to?
The logical thing would be to build a remarketing audience that targets those whom Google Analytics records as having shared through its Social Interactions feature (as we set up above). Unfortunately, it isn’t possible to create an audience from this data, so we must instead create a Google Analytics event whenever someone shares content, and then build our remarketing lists out of these events.
Custom Dimensions
Even without any customisation, Google Analytics comes with a great many built-in dimensions for analysing and segmenting your data (e.g., country of visitor, referring website). But despite all this, your efforts at gaining insight might nevertheless be hampered by the lack of some crucial piece of information specific to your business. Custom dimensions fill this gap by giving you the option to send Google Analytics additional information for your reports.
Let’s look at some use cases that demonstrate how you might use this feature. Imagine you own a newspaper website and have a bunch of different columnists writing articles for your paper. Because talent is never evenly distributed, some authors will be better at engaging readers than others, so it’s important for you to know who these star performers are. By default, Google Analytics only records the URL names, but this doesn’t do you any good because your website makes no mention of the author’s name in the URL (you want to keep your URLs short and sweet instead of turning them into overladen mules for hauling information to Google Analytics[^3-1-26]). Essentially there’s no practical way for you to know how engaging the public finds each of your authors—that is, unless you add a custom dimension for “author name” that gets sent to Google Analytics along with every tracked hit. With this in place, you can start segmenting and studying your data along these lines.
Here’s a second example: Suppose you own an ecommerce store that sells books in various genres (“humour”, “politics”, “business”). Without any custom dimension for “book genre”, you would be unable to tell how each genre performs relative to the others in your Analytics reports.
Illuminating Analytics Blind Spots
Virtual pageviews
Online marketers often have the impression that including the Google Analytics snippet in their website templates is enough to track each and every page view occurring across their websites. Unfortunately, it’s not that simple.
Let’s take a demonstrative example: Suppose you have a “contact us” form found at the URL “/contact_us” (via an HTTP GET request). Your website is designed such that whenever a question is submitted through this form (via an HTTP POST request), the customer gets redirected back to the “/contact_us” page, where they see the same “contact us” form again, except this time there is a little flash message at the top of the screen that reads “Your question was successfully received”. From a human and common sense point of view, it’s clear that two distinct things happened: 1) Someone viewed the “contact us” form and 2) someone sent a question through that form. But Google Analytics, which only has the URL information to go by, just sees that the “/contact_us” URL was visited twice; it cannot know that the second hit, recorded after the question was sent, represented a qualitatively different event. Information is thus lost.
This sort of inaccurate reporting doesn’t just happen with respect to forms that submit data to the same URL as where the form was found—it also affects heavily AJAXed components, like one-page checkout processes. Not only that, but Google Analytics usually can’t track hits to non-HTML content on your website, such as MP3s or downloadable PDFs. This is because these sorts of content are usually served outside the frame of your website template, and so your Google Analytics tracker is unavailable to register hits.
The most generally applicable solution to these problems comes in the form of virtual pageviews,[^3-1-27] a Google Analytics feature whereby you insert JavaScript code that registers page views for pages that “don’t exist” from the URL-centric point of view (hence the “virtual” in this feature’s name). Returning to our original example, after someone submits a question on our contact form, we could have our code show the flash message via JavaScript (so as not to create a duplicate hit to “/contact_us”), and then we could record a virtual pageview for “/contact_us/submitted” (even though the URL “/contact_us/submitted” doesn’t actually exist on our website).
Those with single-page applications will, no doubt, be daunted by the idea of writing exhaustive code for tracking virtual pageviews. Luckily, Google maintain a JavaScript library called Autotrack that might be able to take care of this job for you.[^3-1-28]
Track important occurrences that don’t correspond with page views
Imagine a subscription business (like a web hosting company) that bills their customers monthly for access to their services. During the lead up to a subscription, the customer-to-be would have visited the company’s website to provide their credit card details and initiate the initial billing, with the consequence that this first conversion can be tracked as usual over in Google Analytics. But during the subsequent months there is no further need for this customer to revisit the website; they can simply enjoy their web hosting service and get on with their life.
This scenario presents a tracking challenge. Google Analytics normally only sees occurrences that correspond with explicit user activity on a website or app. As a result of this blinkered vision, any activity that happens behind the scenes will not be recorded. This leaves certain metrics, such as the accumulated revenue generated through monthly billing cycles, invisible to Analytics reports. Without these data points, Analytics reports risk presenting an incomplete or even misleading picture. For instance, patchy data might suggest that advertising campaign A (“half-price basic hosting for the first three months”) radically outperformed campaign B (“superior reliability and data privacy”) in terms of generating revenue. But maybe it turns out that the customers won through the first campaign are fickle and leave as soon as the discount expires, whereas those acquired in the second campaign are more liable to stick around for a longer time since they bought on the basis of reliability and privacy. Because the web hosting business is all about lifetime revenue per customer, their goal is to optimise for long-term contracts. But without any information about subsequent billing cycles (or churn) available in their Analytics reports, this analysis task would be near impossible.
There is a solution to this conundrum: The Analytics Measurement Protocol. Roughly speaking, this Analytics feature gives your programmers the power to send tracking data about any one of your customers to Google Analytics at any time whatsoever—even if there has been no corresponding bout of web activity. Tracking information could be sent, for example, as part of an automatic billing process that happens once a month behind the scenes in your backend server.
You may be curious as to how Google Analytics can associate the data sent by your backend server with the regular tracking activity that occurs on the frontend. This is made possible whenever the website is programmed such that both the backend and frontend tracking systems pass Google the same User ID information, which Google then uses to stitch the two streams of events together. (This User ID is also the mechanism used to stitch together sessions occurring on different devices and browsers.)
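To make this less abstract, here is a minimal Python sketch of what a monthly billing job might assemble before handing it to Google. It assumes the Universal Analytics version of the Measurement Protocol; the event category and action names, the property ID placeholder, and the customer identifiers are all made up for illustration. The sketch only builds the request body and deliberately stops short of sending it.

```python
from urllib.parse import urlencode

# Universal Analytics Measurement Protocol endpoint (hits are POSTed here).
COLLECT_URL = "https://www.google-analytics.com/collect"


def build_renewal_hit(tracking_id, user_id, client_id, amount):
    """Return the URL-encoded body of an event hit for one billing renewal."""
    payload = {
        "v": "1",            # protocol version
        "tid": tracking_id,  # property ID, e.g. "UA-XXXXX-Y" (placeholder)
        "cid": client_id,    # anonymous client ID
        "uid": user_id,      # the same User ID the frontend tracker sends
        "t": "event",        # hit type
        "ec": "billing",     # event category (hypothetical naming)
        "ea": "renewal",     # event action (hypothetical naming)
        "ev": int(amount),   # event value, which must be an integer
        "ni": "1",           # non-interaction hit: no visitor was on the site
    }
    return urlencode(payload)


# Hypothetical customer and client identifiers, for illustration only.
body = build_renewal_hit(
    "UA-XXXXX-Y", "customer-42", "35009a79-1a05-49d7-b876-2b884d0f825b", 20
)
print(body)
# A real billing job would now POST `body` to COLLECT_URL.
```

The important detail is the `uid` field: because it carries the same User ID as the frontend tracker, the renewal can be stitched onto the customer's earlier website activity.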
Tracking with the Analytics Measurement Protocol has use cases beyond those of the company selling subscriptions, as we described above. Consider, for example, a lawyer specialising in workplace injuries who decides to start advertising online. Far away on a distant website, some lead sees this lawyer’s adverts and clicks upon one of them. If the lead is impressed by what he or she reads on the lawyer’s website, then the lead will send the lawyer a question through the website “contact us” form. The lawyer reads this message and phones the lead back. If this first conversation goes well, the lead eventually engages the lawyer’s services.
Sales processes like the above cannot be comprehensively tracked with a standard installation of Google Analytics. As far as Analytics is concerned, any activity occurring after the “contact us” form never happened; these events are completely invisible, completely under the radar. This abrupt cut-off in tracking leaves the lawyer’s data open to similar kinds of misinterpretations, as we saw above with respect to the subscription business. For example, our lawyer, seeing that a certain low-cost advertising campaign caused many leads to send her messages, might end up diverting all her budget to these adverts in order to receive even more messages for her money in future. But what if, unbeknownst to her, the leads she gets from these advertising campaigns rarely buy legal services and instead only want to waste her time by asking for free advice? As it stands, this question cannot be answered; but had our lawyer integrated elements of the Analytics Measurement Protocol into her workflow, then she could simply read the answer off a report. (Exact implementation details for this might involve building an admin-only button that sends over tracking info to Google whenever some lead or other signs a contract for services.)
Collect Your On-Site Search Queries
Searches through your on-site search engine reveal useful information about what your visitors are looking for. The enterprising website owner will harvest these clues and use them to inform future inventory purchasing choices, future website design (e.g., moving commonly searched-for items to the home page), or future online advertising campaigns (e.g., mentioning the most commonly searched-for items within the advertisement copy).
With Google Analytics, you have no need to build custom server-side machinery to harvest this data; instead, you can configure Google Analytics to record all on-site searches for you.[^3-1-29] Once configured, you can see what searches were most often made by your visitors, then sort and filter with respect to any other data point recorded in Analytics.[^3-1-30]
One proviso: For this feature to work, your website architecture must display its search results on a URL which contains the search query parameter, like so:
/search_results?keyword=philosophy
This is instead of an alternative implementation which might swallow the query parameter silently and display results on a generic URL like:
/search_results
Our initial architectural suggestion, which included the search query within the search results URL, makes sense for usability reasons too. Specifically, it allows visitors to link to particular search results pages, to bookmark them, and to send them to their friends or customer support.
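The difference between the two URL styles is easy to demonstrate. The following Python sketch is only an illustration of the principle (it is not how Google Analytics is implemented internally), and the function name and the default parameter name "keyword" are my own assumptions:

```python
from urllib.parse import parse_qs, urlsplit


def search_term(url, parameter="keyword"):
    """Return the on-site search term carried in `url`, or None if absent."""
    values = parse_qs(urlsplit(url).query).get(parameter)
    return values[0] if values else None


print(search_term("/search_results?keyword=philosophy"))  # philosophy
print(search_term("/search_results"))                     # None
```

With the generic URL there is simply nothing for any URL-based tool to read, which is why the first architecture is the one to choose.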
Measuring Success
Set up goals
Google Analytics has a feature called Goals which you can configure to correspond to business targets like signups, purchases, leads, or any other measurable you want to improve. With Goals set up, you can study how pretty much any other Google Analytics dimension influences goal completions. For example, you will be able to tell how different marketing channels, visitor countries, or months of the year affect your conversion rates, enabling you to adapt accordingly.
There are a few different ways to configure Analytics to count a goal as having been completed, the simplest being noting when the user has reached some particular URL. To avail yourself of this simplicity, it helps to have designed your website such that goal completions correspond to URLs that are shown one time (and one time only) during some particular user journey. For example, this might entail redirecting a customer to a “/thanks_for_ordering” page immediately after payment has been made.
Track ecommerce transactions
Probably the most important business metric you’ll want to keep tabs on is ecommerce transactions (i.e., orders). But instead of creating Analytics Goals to track these transactions, you are better advised to implement Google’s special ecommerce tracking plugin.[^3-1-31] Essentially, this plugin lets you record detailed info about what products were bought, in what numbers, and how much revenue their sale generated. As with goals, you can associate this data with pretty much anything else in Google Analytics, enabling you to correlate revenue with, say, landing page, marketing channel, or the device used to browse the website. This is analytical optimisation at its best.
Tag URLs
Imagine an ebook author who, during her launch week, sends out two different emails to her mailing list of 5,000 fans asking them to buy her book through a link in the email. At the end of the week, the author sees that she sold 1,000 books in total. However, she also learns that 500 people on her mailing list unsubscribed. The author is left with a few burning questions: Which of her two emails was better at triggering sales, and which one led to more unsubscribes? With a default Analytics setup, she cannot answer these questions since the traffic from both emails will appear lumped together under the one bucket: “direct” traffic. Not only will it be impossible for the author to distinguish the traffic coming from each of her emails, but it will also be impossible for her to distinguish email traffic from traffic arriving when a user types her website’s URL directly into their browser. What a mess! Luckily, there is a way out of this conundrum: Our author could have tagged the URLs she sent out in her emails, and then Analytics could pick up on these tags and partition the traffic accordingly.
Before we explain how exactly URL tagging works, I’d like to quickly explain another important use case for this technology: that of tracking the root referrer of traffic, as opposed to the previous referrer. An example makes this idea clearer: Imagine our book author posts to her Facebook page a witty post containing a link to her website. This post ends up getting shared hundreds of times on Facebook, and some of these shares end up getting tweeted. Now imagine a random Twitter user sees a tweeted version of the original Facebook post and clicks on the link within, bringing them to her book’s website. The previous referrer (i.e., the immediate one) is twitter.com, and that’s all the information that would be available within a default Analytics report. But with URL tagging, the author would be able to attribute this visitor’s arrival to that earlier Facebook post. In other words, URL tagging can be used to preserve information about the root referrer.
What is URL tagging? In the abstract, it’s an information-preserving hack that stows away valuable data within the URL. This info can later be picked up and unpacked by Analytics. Instead of sending out generic links to some page on your website (e.g., to https://www.example.com/buy\_book), you leave slightly different versions of the link in different places (e.g., one version of the link is given in your first mailing list email, another version is given in the second, and a third version is given in your Facebook social media campaign). These links all lead to the same page on your website (i.e., to the book buying page), but they differ in that each is dressed up with different extra titbits of information called “tags”, which happen to be stored via query parameters appended to the end of the URL. Here’s an example with just one parameter appended: https://www.example.com/buy\_book?utm\_campaign=newsletter-jan-8.
Most typically, the tags used are the five popularised through Google’s URL Builder (we’ll describe these below).[^3-1-32]
Someone who has followed a tagged URL to your website will ask your server not for the standard URL (e.g. https://www.example.com/buy\_book), but rather the tagged URL (https://www.example.com/buy\_book?utm\_campaign=newsletter-jan-8). This expanded, tagged URL gets tracked in Google Analytics as the visited page (instead of the untagged original), and Google can now parse the information out of these tags. In this case, Analytics knows that the “utm_campaign” was equal to “newsletter-jan-8”.
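If you'd rather generate tagged links programmatically than assemble them by hand, a small helper will do. What follows is a minimal Python sketch: the function name and the example tag values are my own, while the `utm_*` names are the standard query parameters that Google's URL Builder produces.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def tag_url(url, source, medium, campaign, term=None, content=None):
    """Append utm_* tags to `url`, keeping any query parameters already present."""
    tags = {
        "utm_source": source,
        "utm_medium": medium,
        "utm_campaign": campaign,
        "utm_term": term,        # optional
        "utm_content": content,  # optional
    }
    parts = urlsplit(url)
    query = parse_qsl(parts.query)
    query += [(name, value) for name, value in tags.items() if value is not None]
    return urlunsplit(parts._replace(query=urlencode(query)))


tagged = tag_url(
    "https://www.example.com/buy_book",
    source="pre-launch-list",
    medium="email",
    campaign="newsletter-jan-8",
)
print(tagged)
```

Because the helper preserves existing query parameters, it can safely be pointed at URLs that already carry parameters of their own.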
In the remainder of this section, I’ll elaborate on the mainstream Google Analytics URL builder approach, but marketers using other analytics platforms can, of course, set their own parameters and program their software to parse these instead.
Google Analytics URL Builder parameters
A few words to Google AdWords users: Be aware that AdWords can be set to automatically fill in these tags for you.[^3-1-33] Compared to setting the parameters manually, this saves you a lot of time, but you should be aware that automatic tagging constrains your usage of the five URL builder parameters such that they must be consistent with Google’s ideas; otherwise you’ll create a horrific mess in your Analytics reports.
That leads us to our next point: What is the intended use, as per Google, for each of their five parameters?
Campaign Source: (Required parameter) Represents the platform that sent you traffic (e.g., “google”, “facebook”, “twitter”, or “linkedin”). If you’ve bought advertising or done deals with smaller websites—be that for banner ads, guest blog posts, or newsletter appearances—write the names of these websites here. Indeed, it’s possible that you might be the owner of the campaign source website, as could happen when you promote your new B2B startup on your personal website. When marketing through email campaigns, it’s tempting to write “email” as the source (or even the name of the company delivering the newsletter, such as “mailchimp” or “sendgrid”). Don’t do this. Instead, write down a descriptive name for which mailing list you are promoting yourself on—is it your “pre-launch-list” or your “new-rental-alerts-subscribers”? By tagging as above, you’ll be able to determine how successful each traffic source has been, and you’ll be able to apportion future marketing spend accordingly.
Campaign Medium: (Required parameter) Represents the mechanism used to deliver your advertising, such as “paid-advertising”, “social-media-post”, “banner-ad”, “offline”, “email”, or “affiliate”. In many cases, Google Analytics already assigns a campaign medium without your doing anything. You’d be advised to adopt their conventions:
- “direct” (when someone typed your URL into the web browser, or visited a bookmark they previously left to your website, or clicked on a link to your website from an instant messaging app)
- “organic” (traffic from search engines)
- “referral” (traffic from other websites which link to you)
- “cpc” (cost per click)
- “social” (i.e., social networking websites like Twitter, Quora, and Reddit)
- “email”
The point of the campaign medium dimension is to determine which mechanism for delivering marketing campaigns gave you the highest return. Does your Christmas sale campaign perform better through the “email” medium, or through “cpc” or “offline-postering”? Campaign medium is there to help you find this out.
Campaign Name: (Required parameter) Used mostly for distinguishing between efforts that would otherwise be impossible to tell apart (e.g., different email newsletters, different postering campaigns, and, most commonly, different CPC advertisements). Often, the campaign name will mirror that of specific categories of products, leaving taggings like “used-iphones”, or “used-macbooks”. Within my own campaign name tagging, I like to include information that helps me distinguish between remarketing and non-remarketing advertisements. I do this by appending the abbreviation “rm” onto my remarketing campaign names (e.g., “used-iphones-rm” vs “used-iphones”).[^3-1-34]
Campaign Term: (Optional parameter) Called campaign term because Google AdWords auto-tags this with the search term that triggered that particular click, enabling you to better understand what keywords you should advertise on. I would only fill in this tag when you are targeting keywords (or their equivalent, such as likes) on some other advertising platform.
Campaign Content: (Optional parameter) Google AdWords automatically populates this field using the headline of your adverts. Given the ever-expanding variety of advertising headlines you’ll experiment with in your quest for high conversion rates, this leads to a semantic mess. You are best off completely ignoring this parameter.
Dealing with grossly oversized URLs
A tagged URL can end up long, cumbersome, and unsightly. Because of its size, it looks unprofessional when shared online, and it can be completely unwieldy when printed out for offline campaigns (e.g., when you want to track the success of your postering campaigns).
To restore concision, you can use a URL-shortening service like Bitly (https://bitly.com/) to package the URL into a bite-sized bundle, ideal for sharing (especially offline, where you cannot expect someone to write out a massive URL that includes tagging parameters).
Tagging consistently
The URL builder is a simple tool that doesn’t remember your past preferences for how you tagged URLs given out in previous marketing campaigns. This opens up a breeding ground for inconsistency should URLs be tagged one way today and another way tomorrow. If campaign source is “facebook” today, “Facebook” the day after, then “Face book” the third day, then data which should appear clumped together under the one category will end up unnaturally dispersed across three.
To avoid this pitfall, create a shared spreadsheet that records exactly how your company tagged URLs in previous campaigns. Doing this guarantees reporting consistency to any marketer with the discipline to reference this spreadsheet and the intelligence to write new tags such that they sit harmoniously with the old.
Better yet, this spreadsheet should be programmed with macros that automatically generate the tagged URLs based on parameters the spreadsheet user places in other cells. This process beats giving yourself arthritis by using Google’s URL builder to individually tag each and every URL.
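The same discipline can be enforced in code. Here is a rough Python sketch of the idea, in which the set of known tags stands in for the shared spreadsheet; the function names and the normalisation rule (lowercase, with whitespace runs replaced by hyphens) are my own assumptions, not a Google convention.

```python
def normalise(tag):
    """Lowercase a tag and hyphenate its whitespace so case variants collapse."""
    return "-".join(tag.lower().split())


def check_tag(tag, known_tags):
    """Return the normalised tag, raising if it is not already on record."""
    cleaned = normalise(tag)
    if cleaned not in known_tags:
        raise ValueError(f"unknown tag {cleaned!r}; add it to the shared record first")
    return cleaned


known_sources = {"facebook", "twitter", "pre-launch-list"}  # hypothetical record

print(check_tag("Facebook", known_sources))  # facebook

try:
    check_tag("Face book", known_sources)
except ValueError as error:
    print(error)
```

Case differences are folded away silently, whereas a genuinely different spelling such as “Face book” is stopped before it can splinter your reports.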
Precautions
Google Analytics data, once written, can never be modified, deleted, or edited. If you slip up, you’ll have to live with the consequences of your mistakes forever.
Generally speaking, there are two ways to mess up Analytics data. The first is mistakes in your own website’s code (e.g., an on-site bug reports ten hits to Analytics for every one hit on your website). The second category of blunders is mistakes in your Google Analytics account, the most common example being when you create an incorrect Google Analytics filter. (You are probably accustomed to thinking of software filters as windows that apply over underlying data, changing what you see right now but leaving the original intact. But Google Analytics filters are different—they permanently modify or remove data, so getting one of these filters wrong will obliterate your data.)
Early detection of error is the name of the game. Best of all is detection before code is deployed to the live server and permanent damage is done. This could be achieved by having a policy of peer-reviewing recent changes or writing automated tests that probe your Analytics implementation for errors. Next best after pre-deploy detection systems are damage minimisation systems designed to quickly detect problems as they occur on your live server. This could be achieved by having automatic alerts that get sent whenever Analytics data is out of normal bounds, or by having someone on your team manually monitor the “real time data” report during the minutes after a dangerous change was made to the website.
With regard to the second broad category of errors (those occurring within your Analytics account), you can protect yourself by creating backup profiles. For instance, you could have a separate Google Analytics profile named “Unfiltered Data” that stands back and gathers all your data without the potentially damaging effects of any special filters or other special Analytics features. The purpose of this profile is nothing other than to serve as the ultimate backup and source of truth if all else fails. If you want to be especially careful, you could also create new “testing the waters” Analytics profiles before effecting any potentially damaging change. Once you are comfortable that your planned change won’t do any collateral damage, you can switch out the “testing the waters” profile for your business-as-usual primary profile.
Be Disciplined in Making Annotations
Anomalous occurrences in your business’s life—such as massive once-off email marketing campaigns, major press coverage, or technical issues that affect tracking—can create massive spikes or sudden gaps in your Analytics data.
In the months or years that follow, these anomalies will confuse you (or the employee taking over your role). Worse yet, if not accounted for in future analyses, these anomalies can contribute to you drawing false conclusions about your business and thereby making poor business decisions, all the while thinking your error is justified by the data.
These future problems can be mitigated by adding annotations to your Analytics account. These are nothing more than simple text notes, placed at certain dates, whose purpose is to alert future readers about the anomaly.
Send Alerts
Google Analytics can be configured to email you whenever something remarkable happens. I will suggest four basic alerts that every webmaster can benefit from.
Alerts when there is no traffic at all over the past 24 hours. At their most dramatic, these alerts inform you about catastrophic infrastructural bugs that take down your entire website. More commonly though, these alerts inform you about bugs in your Google Analytics implementation (e.g., your tracker unexpectedly stopped showing in your HTML template). Knowing about these Analytics problems within 24 hours of their occurrence is important, because it keeps the damage to your data contained. (There is no way to retrospectively repair Analytics data…)
Alerts about there being no conversions in the past hours/days. These would inform you of newly introduced problems that seriously affect the key flows on your live website (e.g., issues with the checkout process preventing purchase conversions). Think of these alerts as basic health tests—assistants to automated software tests, if you will.
Alerts about there being a relative reduction in the conversions compared to a previous period (e.g., 30% down). These might inform you that a previous change in marketing strategy was a bad decision, as evidenced by the reduction in sales.
Alerts about traffic spikes. These will mobilise you to inquire about what caused the spike (e.g., a mention of your brand on Twitter by a big influencer), how you can capitalise on the buzz, and how quickly you can scale out the servers.
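Google Analytics evaluates these conditions for you once the alerts are configured, but it can help to see the arithmetic behind the “relative reduction” alert spelled out. The Python sketch below is purely illustrative: the function names, the 30% threshold, and the conversion counts are assumptions of mine.

```python
def relative_change(current, previous):
    """Fractional change from `previous` to `current` (e.g. -0.35 for 35% down)."""
    if previous == 0:
        raise ValueError("previous period has no conversions to compare against")
    return (current - previous) / previous


def conversions_alert(current, previous, threshold=-0.30):
    """True when conversions fell by at least the threshold versus last period."""
    return relative_change(current, previous) <= threshold


print(conversions_alert(current=65, previous=100))  # True: 35% down
print(conversions_alert(current=90, previous=100))  # False: only 10% down
```

Comparing against the previous period, rather than against a fixed number, means the alert keeps working as your business grows.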
Analytics Blocking
If you’re scratching your head about discrepancies between your figures as recorded in Analytics and your figures recorded in your server’s database or logs, then take heed of the following: A small but rapidly growing proportion of internet users are impossible to track with Google Analytics (or similar technologies), owing to their use of tools for opting out of or completely blocking third-party tracking software. In 2016, Quantable found that as much as 11% of traffic is now untrackable, although they noted that the sample they based their calculation on was skewed towards younger, male, internet-savvy web surfers, the demographic most likely to install privacy-protecting software.[^3-1-35] Regardless of what the exact percentage of untrackable traffic is in your field, analytics blocking is a development to keep your eyes on.
This content is an excerpt from EntrepreNerd, my book on internet marketing for programmers. Buy it now, you won't regret it! {: .janki-rule}
Further Reading
Shopify Facebook Marketing: Whereas the above focused on Google Analytics on your own website, sometimes you'll want to do analysis of a platform-hosted site (e.g. one on Shopify) that integrates with an online advertising platform, such as Facebook Ads. This lovely guide walks you through the issues.
Footnotes
[^3-1-1]: This metric is available only with the Analytics Ecommerce plugin.
[^3-1-2]: This assumes you have Google Search Analytics installed in a connected Google Search Console account.
[^3-1-3]: This assumes you have e-commerce tracking for Analytics configured.
[^3-1-4]: This data is imported from Google AdWords.
[^3-1-6]: Under Search Traffic > Search Analytics
[^3-1-7]: Details here: https://support.google.com/webmasters/answer/1120006?hl=en
[^3-1-8]: By default, Google Analytics cookies only expire after two years. Please consult this reference: https://developers.google.com/analytics/devguides/collection/analyticsjs/cookie-usage?csw=1\#analyticsjs
[^3-1-9]: An implementation that doesn’t depend on log-in functionality could be achieved, for example, by sending out bespoke URLs to each individual lead and then having the software assign unique User IDs based on which of these bespoke URLs was used to access the website.
[^3-1-10]: Through an Analytics feature called Session Unification, hits collected before the User ID is assigned can be later associated with the User ID once it becomes available. This helps you analyse the behaviours that lead up to the assigning of a User ID (e.g., actions taken before creating an account on your website). Read more: https://support.google.com/analytics/answer/4574780?hl=en
[^3-1-11]: Installation details: https://support.google.com/analytics/answer/3123666
[^3-1-12]: Frustratingly, Google Analytics disables goals and ecommerce tracking on new views, so be sure to re-enable them under Admin > Account > Property > View > Goals (/ E-commerce Settings).
[^3-1-14]: Admin > Tracking Info > Referral Exclusion List. See guide: https://support.google.com/analytics/answer/2795830?hl=en
[^3-1-16]: Of course, this will only be a problem if your website accepts URLs in various cases (i.e., responds with HTTP status code 200 instead of 404ing or redirecting).
[^3-1-17]: Do this by entering Admin > Views > Filters. Within the Filter Type field, select “Custom”, then select “Lowercase”, and then select as the Filter Field “Request URI”.
[^3-1-18]: This process of creating a filter is the same as in the previous footnote, except that you’ll now need five separate filters, one for each of the UTM parameters: Campaign Medium, Campaign Source, Campaign Content, Campaign Term, and Campaign Name.
[^3-1-19]: Go to Admin > View Settings > Exclude URL parameters. If you are enabling this feature for an existing website, you might want to do a full audit of all the existing URL parameters ever recorded by your Analytics installation for this website. Do this by switching to the all-time view of your data and then going to Behavior > Site Content > All Pages. Next, search for the telltale symbol that syntactically indicates the start of a parameter listing in URLs: “?”. The results of your search will show all the URLs stored in your Analytics view that contain query string parameters (e.g., “?sort=”), and you can browse through these and locate the ones you’d like to filter out from now on.
[^3-1-20]: Exclude bots in Admin > View Settings.
[^3-1-21]: Google have a guide on doing this: https://support.google.com/analytics/answer/1034840?hl=en
[^3-1-22]: Do this by creating a separate web property in Google Analytics that has its own unique profile ID. Next, configure your website software to send tracking data occurring in the test version of your website to this distinct profile ID.
[^3-1-23]: Admin > Tracking Info > Referral Exclusion List
[^3-1-26]: Not to mention that storing information in the URL doesn’t scale…what if there were 10 dimensions you wished to send to Analytics?
[^3-1-29]: Install on-site search monitoring by visiting Admin > View Settings > and then, within the box “Query parameters”, entering the name of the parameter that your website uses to store the keyword of the search query. For example, in the following URL, the parameter would be “keyword”: /search?keyword=philosophy
[^3-1-30]: View the report by visiting Reporting > Behavior > Site Search > Search Terms.
[^3-1-34]: It seems to me more logical to capture this distinction with the campaign medium dimension, but, alas, Google AdWords automatically tags both normal and remarketing campaigns as “cpc”, meaning this more sensible option is out.