Integrating Google Optimize with Google Tag Manager

You may need to know whether the current page has any Google Optimize experiment running, either to track that info in another tool or to fire some vendor tag based on the current experiments' statuses.

The following snippet will take care of sending a dataLayer push if there’s any active experiment running, including:

  • The Experiment ID
  • The Optimize Container ID where the experiment is running
  • The current experiment variation being shown to the current user/device
(function() {
    window.dataLayer = window.dataLayer || [];
    for (var gtm in window.google_tag_manager) {
        if (gtm.match(/^GTM/)) {
            if (google_tag_manager[gtm].experiment) {
                window.dataLayer.push({
                    'event': 'optimize-experiment-active',
                    'optimize-container-id': gtm,
                    'optimize-exp-name': google_tag_manager[gtm].experiment.split("$")[0],
                    'optimize-exp-variation': 'Variation: ' + google_tag_manager[gtm].experiment.split("$")[1]
                });
            }
        }
    }
})();

Then just create the Data Layer Variables needed to read the pushed data, and use them in your tags/triggers.

The dataLayer push will look like:

    event: "optimize-experiment-active", 
    optimize-container-id: "GTM-XXXXXX", 
    optimize-exp-name: "GTM-XXXXXX_OPT-YYYYY",
    optimize-exp-variation: "Variation: 1"
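To make the shape of those messages easier to check outside the browser, here is a minimal sketch of the same logic as a pure function. The helper name collectActiveExperiments and the sample registry object are made up for illustration; only the "$"-separated experiment string and the key names come from the snippet above.

```javascript
// Hypothetical helper: turns a google_tag_manager-like object into
// the list of messages the snippet above would push to the dataLayer.
function collectActiveExperiments(gtmRegistry) {
  var messages = [];
  Object.keys(gtmRegistry || {}).forEach(function (key) {
    var container = gtmRegistry[key];
    // Skip anything that is not a container entry with an experiment
    if (!/^GTM/.test(key) || !container || !container.experiment) return;
    var parts = String(container.experiment).split('$');
    messages.push({
      'event': 'optimize-experiment-active',
      'optimize-container-id': key,
      'optimize-exp-name': parts[0],
      'optimize-exp-variation': 'Variation: ' + parts[1]
    });
  });
  return messages;
}

// Made-up sample data, shaped like the example push shown above
var sampleRegistry = {
  'GTM-XXXXXX': { experiment: 'GTM-XXXXXX_OPT-YYYYY$1' },
  'GTM-ZZZZZZ': {},
  'dataLayer': {}
};
var pushes = collectActiveExperiments(sampleRegistry);
```

Only the first entry produces a message here, since the other two have no experiment attached.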


SEO meets GA: Tracking search bots visits within measurement protocol

Lately I’ve been attending (and having) some talks about log parsing from the SEO perspective (from @David Sottimano at Untagged Conference and Lino Uruñuela over dinner), and I’ve decided to publish a WordPress plugin that I started working on some years ago. For work reasons I had left it in my “I’ll do it” drawer and it never came back to my mind.

The first thing I need to point out is that this is a BETA PLUGIN, so please be careful about using it on a high-traffic or production site. I’ve been running it on this site for 4 days without any problems, but that doesn’t mean it’s free of bugs. Let’s consider this plugin a proof of concept for now.

The main task of the plugin is to register search bot visits to our WordPress site in Google Analytics, using the Measurement Protocol.

The plugin’s workflow is simple: it checks whether the visiting User Agent matches any known crawler, and if so it sends a pageview to a Google Analytics property. Please keep in mind that it’s recommended to use a new property, since we’re going to use a lot of custom dimensions to track some extra info besides the visited pages =)
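The plugin itself is PHP, but the flow is easy to sketch in JavaScript. The snippet below is only an illustration under some loud assumptions: the two-entry bot list stands in for a real User Agent parser, and sending the bot name in custom dimension 1 (cd1) is an arbitrary choice. The v, tid, cid, t and dp parameters are standard Measurement Protocol fields.

```javascript
// Hypothetical, deliberately tiny bot list (a real parser knows far more)
var KNOWN_BOTS = [
  { name: 'Googlebot', pattern: /Googlebot/i },
  { name: 'Bingbot', pattern: /bingbot/i }
];

// Returns the bot name, or null for a regular visitor
function detectBot(userAgent) {
  for (var i = 0; i < KNOWN_BOTS.length; i++) {
    if (KNOWN_BOTS[i].pattern.test(userAgent)) return KNOWN_BOTS[i].name;
  }
  return null;
}

// Builds the Measurement Protocol payload for a bot pageview,
// or returns null when the visitor is not a known bot.
function buildBotPageview(propertyId, clientId, path, userAgent) {
  var botName = detectBot(userAgent);
  if (!botName) return null;
  var params = {
    v: '1',          // protocol version
    tid: propertyId, // property to send the hit to
    cid: clientId,   // client ID
    t: 'pageview',   // hit type
    dp: path,        // document path
    cd1: botName     // assumption: custom dimension 1 holds the bot name
  };
  return Object.keys(params).map(function (k) {
    return k + '=' + encodeURIComponent(params[k]);
  }).join('&');
}
```

The resulting string would be sent as the body of a request to the Measurement Protocol collect endpoint; regular visitors never generate a hit from this path.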

I used to have my own User Agent parser, but I ended up using another well-established (and for sure more reliable) library. When something works there’s no need to reinvent the wheel :). So this plugin uses the PHP library for the uap-core project.

Let’s see a simple flow chart about what the plugin does:

I’m sure this was easy enough to understand. But we don’t only want to check which pages were visited by a search bot; we’re going further, and we’ll be tracking the following:

  • The Bot Name (Googlebot, Bingbot)
  • The Bot Version (Desktop, Smartphone, Feature Phone)
  • The Response Code Status (200, 404)
  • The page generation time (in ms)
  • Total memory used to render the HTML (in MB)
  • Total queries needed to return the HTML to the bot (an integer with the total MySQL queries needed).
  • A UserID for the bot (this is based on the IP Long value for the current bot).
  • A clientID (a UUIDv4 string based on the bot IP address, that will allow us to check how often that same bot returns to our site, and to track the specific pages being crawled by a specific bot in each session).
  • The real bot user agent, in order to debug and improve our detection engine.

So now we’ll be able to answer the following questions:

  • Which bots visit my content
  • Which content was viewed by each different bot
  • When this content was crawled for the first time
  • Which 404 pages are being crawled by which search bots
  • How often Googlebot or any other search bot is visiting my domain or a specific piece of content
  • How many different bots (IP addresses) have visited my site, and how often they come back
  • Which pages each bot crawled in each session

And for sure you may find answers to a lot more questions: since we’re using Google Analytics to track those visits, we’ll be able to cross any of the dimensions as we need.
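The “IP Long value” mentioned for the UserID is just the IPv4 address read as a single 32-bit number. A minimal sketch of that conversion follows; the function name ipToLong is mine, and how the plugin really derives its IDs may differ.

```javascript
// Converts a dotted IPv4 address into its unsigned 32-bit integer form.
// Returns null when the input is not a valid IPv4 address.
function ipToLong(ip) {
  var octets = String(ip).split('.');
  if (octets.length !== 4) return null;
  var total = 0;
  for (var i = 0; i < 4; i++) {
    if (!/^\d+$/.test(octets[i])) return null;
    var n = Number(octets[i]);
    if (n > 255) return null;
    // Multiply instead of bit-shifting to stay clear of 32-bit sign issues
    total = total * 256 + n;
  }
  return total;
}

var botUserId = ipToLong('66.249.66.1'); // 1123631617
```

Because the same IP always maps to the same number, every hit from that address ends up under one user in the reports.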

Another cool thing about tracking the bot crawls with the Measurement Protocol is that we’ll be able to watch how our site is being crawled in the real-time reports! 🙂

You’ll just need to download the plugin zip file from the following url, drop it in your WordPress plugins folder, and configure the Google Analytics Property ID you want to send your data to.

Used Custom Dimensions

You may be wondering why we have the same bot-related dimensions duplicated with a different scope. As I explained before, we’re using the bot IP address to build the clientID and the userID, and it may happen that Google uses the same IP for different bots (like Desktop or Feature Phone). This way we also have the info at hit level in case the user-scoped data gets overridden 🙂

Another thing we may want to do is to set the session timeout limit to 4 hours in our profile configuration. Bot crawls don’t happen the same way a user navigates a site, and we may be getting 2 page hits per hour, so the default 30-minute timeout makes no sense at all.

Let’s now see how the reports will look in Google Analytics 🙂

Consumed content by bots with an hourly breakdown


Total sessions and pageviews by search bot


Pages that returned a 404 and which bot was crawling them

Which pages a certain bot crawled (User Explorer Report)


You can get the plugin from the following GitHub repository:

If you are unable to run the plugin, please drop me a comment on this post or open an issue on GitHub and I’ll try to take a look at it.

Any suggestions/improvements will be very welcome too 🙂

Cross-Domain tracking with clean urls

I’ve been told by a lot of clients that the way Google Analytics cross-domain tracking works is “ugly”, referring to having the linker param attached to the URL.

I must admit it’s not elegant having that long hash in the URL, though it won’t affect the page functionality. On the other hand, there isn’t any other way to pass the current client ID from the Universal Analytics cookie to the destination domain without dealing with server-side hacks (we can’t read POST data in JS, yet).

Browsers have the History API, which holds the current user navigation history, allows us to manipulate it, and is widely supported by browsers:

history api support by browser

If you have ever dealt with an Ajax-based website, I’m sure you have noticed that even if the page does not reload, the URL gets changed.

The History API allows us to play with the current user session history. For example, window.history.length returns the number of elements in the session history: if you have browsed 4 pages in the current session, it’ll return 4.

Calling window.history.back() will send the user back to the previous page in the session.

But we’re going to focus on the pushState and replaceState methods. pushState adds a new entry to the history record and replaceState modifies the current one; both allow us to change the current page URL without needing to reload the page.

I bet you’re guessing that we’re going to strip out the _ga parameter with those functions, and you’re right. This won’t be harmful for the cross-domain tracking, since we’re going to do it after the Google Analytics object has been created; it won’t affect our implementation, and we’ll end up showing the user a cleaned-up URL after Google Analytics does all its cross-domain tracking magic.

We’ll be using “replaceState” in this example, so that users clicking the back button aren’t sent to the same page again. This method will just change the URL but WON’T add a new entry to the session history.

To achieve this hack, we’ll be using the hitCallback of our Pageview tag in Google Tag Manager.

First, we are going to need a variable that takes care of reading the current URL, cleaning it up, and updating the browser’s URL using the History API.

I’m calling it “remove _ga from url pushState”; feel free to name it as you like:

function(){
  return function(){
      if (document.location.search.match(/_ga=([^&]*)/)) {
          var new_url;
          // A small function to check if an object is empty
          var isEmptyObject = function(obj) {
              var name;
              for (name in obj) {
                  return false;
              }
              return true;
          };
          // Let's build an object with the key-value pairs from the current URL
          var qsobject = document.location.search.replace(/(^\?)/, '').split("&").map(function(n) {
              return n = n.split("="),
              this[n[0]] = n[1],
              this;
          }.bind({}))[0];
          // Remove the _ga parameter
          delete qsobject['_ga'];
          // Let's rebuild the querystring from the previous object with the _ga parameter removed
          var rebuilt_querystring = Object.keys(qsobject).map(function(k) {
              if (!qsobject[k]) {
                  return encodeURIComponent(k);
              } else {
                  return encodeURIComponent(k) + '=' + (encodeURIComponent(qsobject[k] || ""));
              }
          }).join("&");
          // If no parameters are left, we drop the "?" altogether
          if (isEmptyObject(qsobject)) {
              new_url = location.pathname + location.hash;
          } else {
              new_url = location.pathname + '?' + rebuilt_querystring + location.hash;
          }
          // Use replaceState to update the current page URL
          window.history.replaceState({}, document.title, new_url);
      }
  };
}

Now we only need to add this new variable as the hitCallback value for our pageview tag:

So this is what is going to happen now:

1. The Google Analytics object will be created
2. It will process the linker parameter, overriding the current landing domain clientId value as long as the linkerParam value is legit
3. After that, the current page URL will be replaced by the same URL with the _ga parameter stripped out.
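The URL clean-up in the last step can also be sketched as a small pure function, which makes it easy to try outside GTM. This is an alternative sketch, not the variable above: it assumes URLSearchParams is available (older browsers such as Internet Explorer lack it), and the name stripLinkerParam is mine.

```javascript
// Rebuilds pathname + querystring + hash without the _ga linker parameter.
function stripLinkerParam(pathname, search, hash) {
  var params = new URLSearchParams(search); // a leading "?" is ignored
  params.delete('_ga');
  var qs = params.toString();
  // Drop the "?" entirely when no parameter survives
  return pathname + (qs ? '?' + qs : '') + hash;
}

var cleaned = stripLinkerParam(
  '/landing',
  '?_ga=1.199239214.1624002396.1440697407&utm_source=news',
  '#top'
);
// cleaned === '/landing?utm_source=news#top'
```

In the browser, the result would be handed to window.history.replaceState({}, document.title, cleaned) from the hitCallback, so the address bar changes without adding a history entry.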

Bringing back utm_nooverride functionality to Universal Analytics

Universal Analytics removed the utm_nooverride=1 functionality. We can still define a list of referral domains to be treated as direct visits in our property configuration, but what about when we can’t control the source domains? For example, emailings, or some display campaign that we don’t want to override our users’ original attribution.

We’re going to use Google Tag Manager to bring this functionality back to our implementations.

First we need a variable that reads whether there is a querystring parameter named utm_nooverride, and its value.

OK, this variable will hold the value “1” when utm_nooverride=1 is present in the URL. Now we’re going to use it to force the “dr” (document referrer) parameter just in that situation.

For that we’re going to need an extra Custom JavaScript variable with the following code in it:

Let’s be lazy! You can copy this little piece of code below:

  function(){
      if({{QS - utm_nooverride}}=="1"){
          return document.location.origin;
      }
      return document.referrer;
  }

We’re almost set. Now we want to force our pageview tag to use this last created variable for the “referrer” field.

We’re done! Now if the utm_nooverride parameter is present on the landing page, Google Tag Manager will send the current domain name as the referrer, forcing that new visit to be treated as direct traffic.
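For anyone who wants to check the decision logic outside GTM, here is a minimal sketch of it as a pure function. The name resolveReferrer and the sample URLs are made up; in the real setup the inputs come from the page URL, document.location.origin and document.referrer.

```javascript
// Returns the value to send as document referrer ("dr"):
// the site's own origin when utm_nooverride=1 is present,
// the real referrer otherwise.
function resolveReferrer(search, origin, referrer) {
  var match = /[?&]utm_nooverride=([^&#]*)/.exec(search);
  return match && match[1] === '1' ? origin : referrer;
}

var forced = resolveReferrer(
  '?utm_source=mail&utm_nooverride=1',
  'https://www.example.com',
  'https://mail.example.org/'
);
// forced === 'https://www.example.com'
```

Sending the site's own origin as referrer is what makes Google Analytics see the visit as a self-referral and keep the previous attribution.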

UPDATE: I don’t recall if the override took precedence over campaign parameters. If you know, please drop a comment :). Otherwise I’ll be checking it in the next few days.


How to keep your returning user’s legacy data when switching domain name

When we’re switching a site’s domain name, we always have some basic steps in mind so the migration doesn’t end up being a mess. One of those steps is usually 301-ing our old domain content to the new one, but we never think about how this will affect our current Google Analytics data.

The Universal Analytics cookie is tied to the domain hostname, so if we switch the current domain, a new cookie will be created along with a new client ID, and all the visits we redirect will end up being new visitors. This means we’ll be losing ALL our previous attribution/history data for returning visitors. Doh!

This time, we’ll try to mitigate this problem using Google Tag Manager and some mod_rewrite (.htaccess) magic.

We’ll be using Apache’s rewrite module to read the current user’s “_ga” cookie and pass it along with our redirection; then from GTM we’ll force the clientId in our tracker in order to keep our old users’ clientId on our new domain 🙂

Below you can find our .htaccess. As you can see, we check for the _ga cookie value, and then we redirect the user to the new domain with a new parameter named “__mga” that holds the _ga cookie value and the timestamp.

RewriteEngine On
RewriteBase /
RewriteCond %{HTTP_COOKIE} _ga=([^;]+) [NC]
RewriteRule ^(.*)$$1?__mga=%1\.%{TIME} [R=301,QSA,L]
RewriteRule ^(.*)$$1 [R=301,QSA,L]

You may be asking yourself why we are adding the current timestamp (%{TIME}) to the parameter value. The reason is pretty simple: we don’t want someone sharing that URL with someone else and ending up with a lot of users sharing the same clientId, do we?

We’ll use that value later in Google Tag Manager to check whether the redirection was generated less than 120 seconds ago; if not, we won’t return any value. This is how the native Universal Analytics cross-domain feature works too!

if (({{current.timestamp}} - {{__mga.timestamp}}) > 120)

If the difference between the current user timestamp and the __mga timestamp is higher than “120”, the link was generated more than 2 minutes ago, so we don’t want to push any clientId back in this case.

The current format of the %{TIME} value from mod_rewrite is the following:

YYYYMMDDHHMMSS

And it will likely be using the UTC timezone. This is important since the check will be made client-side, and we’re going to need to compare against the current user time in UTC, not in the client’s timezone.
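As a rough sketch of that freshness check, the stamp can be parsed straight into a UTC epoch with Date.UTC, which avoids any timezone arithmetic on the client. The function names are mine, and the sketch assumes, as the post does, that the server writes the stamp in UTC; if your server uses local time the comparison will be off by its offset.

```javascript
// Parses a YYYYMMDDHHMMSS stamp (assumed to be UTC) into epoch seconds.
// Returns null when the stamp does not have the expected shape.
function parseRewriteTime(stamp) {
  var m = /^(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})$/.exec(stamp);
  if (!m) return null;
  // Months are zero-based in JavaScript dates
  return Date.UTC(+m[1], +m[2] - 1, +m[3], +m[4], +m[5], +m[6]) / 1000;
}

// True when the stamp is younger than maxAgeSeconds at the given moment.
// nowMs is an epoch in milliseconds, e.g. Date.now() in the browser.
function isLinkFresh(stamp, nowMs, maxAgeSeconds) {
  var created = parseRewriteTime(stamp);
  if (created === null) return false;
  return Math.round(nowMs / 1000) - created < maxAgeSeconds;
}
```

In the browser this would be called as isLinkFresh(stamp, Date.now(), 120), since Date.now() is already timezone independent.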

GTM Configuration

On the Google Tag Manager side, we’ll need one variable that takes care of grabbing the __mga value, and from there we’ll get the clientId and the link generation time.

Then we’ll check the user’s browser UTC timestamp to see whether this link was generated less than 120 seconds ago, to know if we should return any value.
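To illustrate the first half of that job, here is a small sketch that pulls the clientId out of the linker value. It assumes the usual _ga cookie layout (GA1.<number>.<random>.<timestamp>) followed by the 14-digit time stamp appended by the redirect; the function name and the sample value are made up.

```javascript
// Extracts "<random>.<timestamp>" (the clientId) from a value such as
// GA1.2.1234567890.1440697407.20150827174327
// Returns undefined when the value does not have the expected shape.
function extractClientId(linkerValue) {
  var m = /^GA\d+\.\d+\.(\d+\.\d+)\.\d{14}$/.exec(linkerValue);
  return m ? m[1] : undefined;
}

var clientId = extractClientId('GA1.2.1234567890.1440697407.20150827174327');
// clientId === '1234567890.1440697407'
```

Returning undefined for anything unexpected matters here: when the variable yields no value, the tracker simply falls back to its normal clientId handling.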

Grab the variable code below:

function() {
  try {
    // Let's grab our custom linker value from the QS
    var _mga_value = document.location.search.match(/__mga=([^&]*)/)[1];
    // The last dot-separated chunk is the %{TIME} stamp
    var _mga_linker_value = _mga_value.split('.').pop();

    // Let's convert the YYYYMMDDHHMMSS date to timestamp format
    var _mga_date = new Date(_mga_linker_value.slice(0, 4), _mga_linker_value.slice(4, 6) - 1, _mga_linker_value.slice(6, 8), _mga_linker_value.slice(8, 10), _mga_linker_value.slice(10, 12), _mga_linker_value.slice(12, 14));

    // The date above was parsed as local time, so let's apply the browser timezone offset to get UTC seconds
    var _mga_timestamp_utc = Math.round(_mga_date*1/1000)-new Date().getTimezoneOffset()*60;

    // This is the current browser UTC time
    var _browser_timestamp_utc = new Date()*1;

    // This is going to be the total seconds diff, between linker creation time and current user's browser time
    var _linking_offset_in_sec = Math.round(_browser_timestamp_utc/1000 - _mga_timestamp_utc);

    // Let's force the clientId value ONLY if the time difference is less than 2 minutes
    // (assumes the usual GA1.<n>.<random>.<timestamp> layout of the _ga cookie)
    if (_linking_offset_in_sec < 120) {
      return _mga_value.split('.').slice(2, 4).join('.');
    }
  } catch (e) {}
}

Now we only need to use the value returned by this variable as the “clientId” value on our tracker, this way:

Of course this can be applied not only to Google Analytics but to any other cookie value you want to keep; just modify the code to grab whatever cookie value you may need.

Google Tag Manager event tracking using data attribute elements

In the last #tip we talked about how to debug/QA our data attributes, and now we’re going to learn how to natively track events/social interactions with Google Tag Manager.

We’re going to base our tracking on Google Analytics Events and Social Interactions. Of course this can be extended to any other tool just by changing the data attributes, but hey, this is about learning, not about giving you a copy-and-paste solution.

Let’s start by saying that data-* attributes are standard HTML5 markup that we can use to manage our page functionality based on that data instead of relying on classes or ids.
A data attribute is intended to store values that are meaningful to the page or application and that don’t fit in any other more appropriate attribute.

In our case the data that we’re storing is the hit type that we’ll be firing. In our example it could be an “event” or a “social interaction”. For this we’re setting a data attribute named “wa-hittype”, and this attribute will hold the type of hit to be fired, in our case “event” or “social”.

We’ll be using some other data attributes to define our event’s category, action, label, value and non-interaction switch. Please take a look at the following table for more details:

Data Attr                 Description
data-wa-hittype           Type of hit we want to fire on the user's click
data-wa-event-category    The category value for our event
data-wa-event-action      The action value for our event
data-wa-event-label       *optional. The label for our event
data-wa-event-value       *optional. The value for our event, if any
data-wa-event-nonint      *optional. Is the event non-interactional?

Let’s check an example:

<a href="#"
 data-wa-hittype="event"
 data-wa-event-action="Add To Cart"
>Add To Cart</a>

So we have a data attribute that will let us know when to fire a tag based on a CSS selector, and we also have all the info needed to populate our event.

Next step is to configure some variables to read these values when the user clicks on the element.

So now when the user clicks on some element, we’ll have all the data needed for our event in those new variables. Let’s work on the trigger that will make our tag fire.

We’re using the built-in {{Click Element}} variable and some magic with a CSS selector.

There we are. Now we just need to set up our event tag, add our variables to the tag fields, and set the trigger on this new event tag.

Now every time you need to track a new click on some page element, you’ll just need to ask the developers to add some “standard” data markup to the right element. Even if they do something wrong, the variables will take care of fixing the values where possible (like an event value that should be an integer arriving as a string) or setting a proper boolean value for the event’s non-interaction switch.
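The kind of value fixing described above could look roughly like the sketch below. It is not the code behind the GTM variables, just an illustration: the function name is mine, and it relies on the standard rule that data-wa-event-category is exposed as dataset.waEventCategory on the clicked element.

```javascript
// Builds a normalized event description from an element's dataset.
// Returns null when the element is not marked up as an "event" hit.
function buildEventFromDataset(dataset) {
  if (!dataset || dataset.waHittype !== 'event') return null;
  // Event values must be integers; anything unparsable is dropped
  var value = parseInt(dataset.waEventValue, 10);
  // Accept "true" or "1" in any letter case as a non-interaction flag
  var nonInt = String(dataset.waEventNonint).toLowerCase();
  return {
    hitType: 'event',
    eventCategory: dataset.waEventCategory,
    eventAction: dataset.waEventAction,
    eventLabel: dataset.waEventLabel,
    eventValue: isNaN(value) ? undefined : value,
    nonInteraction: nonInt === 'true' || nonInt === '1'
  };
}

// Made-up sample, shaped like the dataset of a clicked element
var hit = buildEventFromDataset({
  waHittype: 'event',
  waEventCategory: 'Ecommerce',
  waEventAction: 'Add To Cart',
  waEventValue: '12.5',
  waEventNonint: 'TRUE'
});
// hit.eventValue === 12, hit.nonInteraction === true
```

In a click handler the input would simply be event.target.dataset, so the markup stays the single source of truth for what gets tracked.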

Any suggestion or improvement to this tracking method is welcome 🙂

P.S. Yeah! I know I talked about tracking Social Interactions too, but I’m pretty sure that you’ll be able to figure it out. Think of it as a good moment to learn how to do things instead of just copying and pasting and hoping it will work.

GAUPET Release: Google Analytics User Permissions Explorer Tool

Some months ago I asked some friends to test a new tool I was working on, and last week I released something close to an open alpha. Today, after polishing some details and a UI redesign that is 100% mobile compatible, I’m announcing the GAUPET release.

At first I named it GA Governance Tool, but after an interesting chat with the “osom” Yehoshua Coren, I (we) decided to change the tool’s name to something closer to what it is, and here it is: GAUPET, which stands for Google Analytics User Permissions Explorer Tool. (Yep, you’re right, I didn’t rack my brain on this one.)

You can find it at the following link: GAUPET

This will allow you to easily manage and pivot all your Google Analytics users and permissions in order to have a clear view of your current accounts’ User Governance status.

GAUPET will allow you to gather all your account user emails and permissions and draw them into an interactive pivot table. It will even allow you to merge different accounts’ users within the same report (thanks go to Peter O’neill for this and other nice suggestions that will come in the future).

The tool comes with some predefined reports, but you will be able to pivot any data the way you need. Just drag and drop the fields, that’s it!

The included fields are:

  • Email Address
  • Email Domain
  • Access Level
  • Account ID
  • Account Name
  • Account Access Rights
  • Account Permissions
  • Property ID
  • Property Name
  • Property Access Rights
  • Property Permissions
  • View ID
  • View Name
  • View Access Rights
  • View Permissions

Let’s take a look at a sample report for users with view access:

I’m offering this tool for free, and I’m hosting it for free, which means it’s offered “as is”. Still, you’ll have a feedback section on the page to report bugs or ask for new features, and I’ll try to make updates in my free time.

Extra thanks fly to Damion Brown, Ani Lopez, Simo Ahava, Natzir Turrado, Doug Hall and Brian Clifton for their comments and testing. #tip Each of them is worth a follow 🙂

#Tip – How to quickly debug/qa data attributes

Over the years I’ve learned that using CSS selectors to track user actions is really great, but sadly I’ve also learned that it’s really dangerous.

It’s true that we won’t need to ask the IT team to add some dataLayer or ga pushes to the page, which saves a lot of precious time, but on the other hand, any single page update or test may break our tracking.

Now I try to use data attributes wherever possible, since those are more likely to be kept across layout updates.

Checking elements for data attributes can be a tedious task, so I’m going to show you a little piece of code that I hope will make your life easier if you have based some of your implementations on data attributes.

This little snippet is where the magic happens:

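(function() {
    // Keep only the elements that carry at least one data-* attribute
    var elements = [].slice.call(document.querySelectorAll('*')).filter(function(el) {
        if (typeof (el.dataset) != "undefined")
            return Object.keys(el.dataset).length != 0;
    });
    var data = [];
    var i = elements.length;
    while (i--) {
        var el = JSON.parse(JSON.stringify(elements[i].dataset));
        el["_element_type"] = elements[i].nodeName;
        data.push(el);
    }
    // Print the collected attributes as a table in the browser console
    console.table(data);
})();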


As an example I’m going to show you the output for Google Tag Manager’s homepage.

This has been a great time saver for me. Hope you find it useful too 🙂

Universal Analytics Plugin Online Hackathon – Dual tracking

I’ve been thinking about doing a Google Analytics related hackathon for a long time. Some months ago, I started to look at how Universal Analytics plugins work, and I decided that coding a plugin to send all the data to a secondary property would be a really nice example.

For years now, I’ve been sharing a lot of code that I’ve worked on, and some tracking ideas too, but I still don’t consider myself a developer. If I must say it, I really think I suck at programming, even if I can do some stuff myself.

So here I am, trying to organize an online Universal Analytics Hackathon. I hope this can turn into a great chance to learn from other people, and to understand how plugins work!

Of course you may be asking what a “Hackathon” is (don’t be shy about asking). Let’s quote Wikipedia:

A hackathon (also known as a hack day, hackfest or codefest) is an event in which computer programmers and others involved in software development and hardware development, including graphic designers, interface designers and project managers, collaborate intensively on software projects. Occasionally, there is a hardware component as well. Hackathons typically last between a day and a week. Some hackathons are intended simply for educational or social purposes, although in many cases the goal is to create usable software. Hackathons tend to have a specific focus, which can include the programming language used, the operating system, an application, an API, or the subject and the demographic group of the programmers. In other cases, there is no restriction on the type of software being created.

GitHub Repository:

For now I’ve pushed some “core” code to the repository that “already” works.

How to load the plugin:

ga('create', 'UA-286304-123', 'auto');
ga('require', 'dualtracking', '', {
    property: 'UA-123123123213-11',
    debug: true,
    transport: 'image'
});
ga('send', 'pageview');

Some stuff you need to keep in mind when loading a plugin in Google Analytics:

  • The plugin needs to be hosted within your domain
  • It needs to be “initialized” AFTER the “create” method call and BEFORE the “pageview” method.
  • If for some reason the plugin crashes it may affect your data collection, please don’t use this in production before it has been fully tested.
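For readers new to analytics.js plugins, here is a bare-bones sketch of how a dual tracking plugin can hook into the tracker. It is not the code from the repository: it only shows the general technique of wrapping sendHitTask and rewriting the tid parameter of the hit payload. The config.send option is a hypothetical hook added so the sketch can be exercised without a browser; the property option matches the loading example above.

```javascript
// Plugin constructor: analytics.js calls it with the tracker and the
// options object passed to ga('require', ...).
function DualTracking(tracker, config) {
  // Assumed option: a custom transport; defaults to an image request
  var send = config.send || function (payload) {
    new Image().src = 'https://www.google-analytics.com/collect?' + payload;
  };
  var originalSendHitTask = tracker.get('sendHitTask');

  tracker.set('sendHitTask', function (model) {
    // Always let the original hit go out first
    originalSendHitTask(model);
    try {
      // Copy the payload, swapping the property ID for the secondary one
      var payload = model.get('hitPayload').replace(
        /(^|&)tid=[^&]*/,
        '$1tid=' + encodeURIComponent(config.property)
      );
      send(payload);
    } catch (e) {
      // A failure on the copy must never break the primary tracking
    }
  });
}

// Register the plugin when analytics.js is present on the page
if (typeof ga === 'function') {
  ga('provide', 'dualtracking', DualTracking);
}
```

Calling the original task before touching the copy keeps the main property safe even if the secondary hit fails, which is the crash risk the list above warns about.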

Still it needs to be improved, for example:

  1. We don’t want to use global variables
  2. Payload size check, and based on the results send a POST or GET request
  3. Add XHR transport method
  4. Code cleanup/Best practices
  5. Plugin option to send a local copy for the hits
  6. Better debug messages
  7. Naming convention improvements
  8. Any other idea?

Anyone is welcome to push code, add ideas, or give testing feedback, through the GitHub repository or the comments on this blog post.





Getting super clean content reports in Google Analytics using GTM

In Google Analytics, URLs are case sensitive, so in our content reports /index.html will be different from /Index.html. Querystring values will also make Google Analytics record the same page as a new one: /index.html?cache=off and /index.html?cache=on will be recorded as 2 different pages, for example.

The first problem is easily fixable with a lowercase filter in the views, but the querystring parameters are going to be a problem… I bet you’re saying that you can just add them to the Exclude URL Query Parameters list in your view configuration page, and yes, that’s right! But I’m pretty sure you’re likely going to end up with marketing campaigns adding new parameters, or IT adding parameters to switch on some functionality (like enabling some caching feature or whatever).

So today we’ll be using Google Tag Manager to solve this problem of having our content reports fragmented by unexpected querystring parameters in our pages. Let’s think about it: wouldn’t it be easier to identify the real parameters and get rid of the rest, which are not needed for the page functionality? It’s likely a better way to do it: we can know which parameters will be used on our site, but we cannot anticipate the unexpected ones.

To achieve this, we’re going to make use of just one single variable in Google Tag Manager. Yeah, that’s it, just one single Custom JavaScript variable.

We’ll just need to configure the paramsList array at the top of the code, adding all the querystring parameters that we want to keep. Any other parameter that is not listed in our array will be removed from the querystring value that is going to be recorded by Google Analytics.

function(){
    try{
        // We need to define the QS parameters we want to keep in our reports
        var paramsList = ["two","one","three"];

        // Cross-browser indexOf polyfill
        if (!Array.prototype.indexOf) {
            Array.prototype.indexOf = function (searchElement /*, fromIndex */ ) {
                "use strict";
                if (this == null) {
                    throw new TypeError();
                }
                var t = Object(this);
                var len = t.length >>> 0;
                if (len === 0) {
                    return -1;
                }
                var n = 0;
                if (arguments.length > 1) {
                    n = Number(arguments[1]);
                    if (n != n) { // shortcut for verifying if it's NaN
                        n = 0;
                    } else if (n != 0 && n != Infinity && n != -Infinity) {
                        n = (n > 0 || -1) * Math.floor(Math.abs(n));
                    }
                }
                if (n >= len) {
                    return -1;
                }
                var k = n >= 0 ? n : Math.max(len - Math.abs(n), 0);
                for (; k < len; k++) {
                    if (k in t && t[k] === searchElement) {
                        return k;
                    }
                }
                return -1;
            };
        }
        var qsParamsSanitizer = function(qs, permitted_parameters){
            // Build an object with the key-value pairs from the querystring
            var pairs = qs.slice(1).split('&');
            var result = {};
            pairs.forEach(function(pair) {
                pair = pair.split('=');
                result[pair[0]] = decodeURIComponent(pair[1] || '');
            });

            // Drop every parameter that is not in the permitted list
            var qsParamsObject = JSON.parse(JSON.stringify(result));
            for (var p in qsParamsObject){
                if (permitted_parameters.indexOf(p) == -1)
                    delete qsParamsObject[p];
            }
            var rw_qs = '?' +
                Object.keys(qsParamsObject).map(function(key) {
                    return encodeURIComponent(key) + '=' +
                        encodeURIComponent(qsParamsObject[key]);
                }).join('&');
            if(rw_qs=="?") rw_qs="";
            return rw_qs;
        };
        return qsParamsSanitizer(document.location.search, paramsList);
    }catch(e){
       // let's let GA use the current location.href if
       // for some reason our code fails.
       return undefined;
    }
}

Now we only need to set our pageview tag’s “page” parameter so Google Analytics uses the new sanitized querystring instead of the one in the URL.

We’re done! Let’s see how it works with a screenshot
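To make the effect concrete without a screenshot, here is a compact sketch of the same whitelist idea as a pure function, together with a couple of sample inputs. The name keepParams is mine, and this is a simplified stand-in for the variable above rather than a copy of it: it keeps the surviving pairs exactly as they appear in the URL instead of decoding and re-encoding them.

```javascript
// Returns the querystring with only the allowed parameters left,
// or an empty string when none of them survive.
function keepParams(search, allowed) {
  var kept = search.replace(/^\?/, '').split('&').filter(function (pair) {
    // Ignore empty chunks and any parameter whose name is not whitelisted
    return pair !== '' && allowed.indexOf(pair.split('=')[0]) !== -1;
  });
  return kept.length ? '?' + kept.join('&') : '';
}

var allowedParams = ['two', 'one', 'three'];
var mixed = keepParams('?one=1&cache=off&two=2', allowedParams);
// mixed === '?one=1&two=2'
var onlyNoise = keepParams('?cache=on', allowedParams);
// onlyNoise === ''
```

With this in place /index.html?cache=on and /index.html?cache=off both collapse into /index.html in the content reports.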

Now you just need to sit down, and wait some hours to start seeing your reports in a clean way and with no fragmentation. Happy analyzing!