Google Analytics 4 (GA4) Events Demystified

15 min · 4489 words
#GA4

At this point, many (if not all) of you have heard that Google Analytics is moving to an "events"-based tracking model with Google Analytics 4. But what does that really imply? Do we have to worry about it? To be honest, it's not a big deal from the implementation side, since we have been using "events" all along; we just used to call them hit types. From the reporting side, it may lead to some "hard times" when trying to use the data, not because it's better or worse, just because it's different.

This post tries to explain Google Analytics 4 events from a technical perspective: how the current event model works, where the events can come from, the limitations, and so on.

I'd say that one of the most important things when working with GA4 is realizing how important the data model we define at the start is going to be, because it will condition the future of our implementation and data.

But don't worry about this for now. We'll dig into it throughout the post.

How does Google Analytics 4 record the data

Google Analytics 4 works much like Universal Analytics.

We'll be sending hits (network requests) to a specific endpoint (https://endpoint.url/collect). This shouldn't be anything new to anyone; that's how all analytics tools and pixels work. It works this way for client-side tracking (gtag.js), server-side tracking (Measurement Protocol), and app tracking (Firebase Analytics SDK).

Tracking endpoints

I found there are 5 different endpoints that we could use to send the data to Google Analytics 4, these are:

    Depending on where we are doing the tracking we'll be using one of them.

    We could see hits flowing to 4 different endpoints for GA4 + 1 for Firebase

    The first two endpoints are the ones used by client-side tracking, but you may wonder why we sometimes see hits going through analytics.google.com and other times through the google-analytics.com domain. The reason is that if the current GA4 property has "Enable Google signals data collection" turned on, GA4 will use the *.google.com endpoint (so that Google is able to use its own cookies to identify users, I guess).
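When debugging, it can help to label each captured hit by where it was sent. The following is only a rough sketch: `classifyGa4Hit` and its labels are hypothetical names, and it relies solely on the hostnames and the `/mp/collect` path mentioned in this post, so treat anything else as unknown.

```javascript
// Hypothetical debugging helper: guesses the origin of a GA4 hit from its URL.
// It only knows the hosts/paths named in this post; everything else is 'unknown'.
function hostMatches(hostname, domain) {
  return hostname === domain || hostname.endsWith('.' + domain);
}

function classifyGa4Hit(hitUrl) {
  const { hostname, pathname } = new URL(hitUrl);

  if (hostMatches(hostname, 'app-measurement.com')) {
    return 'firebase-sdk';
  }
  if (hostMatches(hostname, 'google-analytics.com') && pathname.startsWith('/mp/collect')) {
    return 'measurement-protocol';
  }
  if (hostMatches(hostname, 'analytics.google.com')) {
    // Assumption taken from the text: this host is used when Google signals is on.
    return 'web (Google signals enabled)';
  }
  if (hostMatches(hostname, 'google-analytics.com')) {
    return 'web';
  }
  return 'unknown';
}

console.log(classifyGa4Hit('https://www.google-analytics.com/mp/collect?measurement_id=G-THYNGSTER'));
```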

    JavaScript Client Library

    Page tracking is done using a library provided by Google, the same way we had the analytics.js, ga.js, or urchin.js libraries in past Google Analytics versions.

    The default code snippet will look like this:

    <!-- Global site tag (gtag.js) - Google Analytics -->
    <script async src="https://www.googletagmanager.com/gtag/js?id=G-THYNGSTER"></script>
    <script>
      window.dataLayer = window.dataLayer || [];
      function gtag(){dataLayer.push(arguments);}
      gtag('js', new Date());
    
      gtag('config', 'G-THYNGSTER');
    </script>
                                    

    As you may have noticed, the snippet loads a JavaScript file from the www.googletagmanager.com domain. This is because every gtag.js snippet is, in essence, a predefined Google Tag Manager template. It's not just a plain GTM container, since it does some internal work of its own, but it is also built on tags, triggers, and variables.

    Previous tracking libraries offered a public API to perform all the tracking on our end, i.e. they accepted some methods/calls and converted them to hits, handled cross-domain tracking, and allowed us to use Tasks, while also taking care of generating the cookies and reading the browser details. And that single library was shared across all users worldwide.

    This no longer works that way. Now each Data Stream / Measurement ID has its own snippet and loads a separate JS file. We may see this as a performance penalty, but it's done this way for a reason.

    Each gtag.js container is now built dynamically at Google's end; it contains personalized code for the current property and also holds the settings for the current Data Stream / Measurement ID. That's why the size differs for each container we check. Don't worry, this is normal and expected. The container size will vary depending on many things, such as whether we have the Enhanced Measurement features enabled or the settings we've defined in the admin interface for our property.

    One thing that confused me when Google Analytics 4 arrived was the idea that lots of things were happening in the background that were nearly impossible to debug, like conversions or created/modified events.

    Well, that's not how it works. Almost any setting or feature you enable in the admin is translated into code and executed client-side. This means that when you add a new event in the interface, some code is added to the gtag.js container that will send that event, so you may end up seeing "ghost" events in the browser. Don't waste your time, as I did, trying to work out why your implementation is firing duplicated events :). The same happens, for example, when we define a conversion event, or when we configure our internal domains or the ignored referrals.

    While this approach may help some people with common tracking tasks, it also prevents some advanced implementations, because some "loved" features like "customTasks" are now missing. I'm OK with Google trying to control how things are done, but there will always be sites that need custom/personalized implementations, and I really feel that Google should provide some public, documented API methods to easily perform the most common tasks, like cross-domain tracking, in Google Analytics 4.

    Let's see some examples. When you "create a new event" from the admin interface, this event won't be created server-side; what is happening is that GA4 adds some code logic to send that hit client-side.

    Another example is enabling Enhanced Measurement, which adds some code to your container. Remember we mentioned that GA4 is in essence a Google Tag Manager container? If you take a look at the current measurement categories, you'll notice how they all match the triggers available in GTM (click tracking, scroll tracking, YouTube tracking).

    And that's not all: when we change the session duration or the engagement time, some session_timeout variables are updated internally (engagementSeconds, sessionMinutes, sessionHours).

    We could keep going with examples, or build a full list, but it would likely become outdated sooner rather than later. The main idea to take from this part of the post is that gtag.js is like a "predefined" GTM template and that all the tracking happens in the client's browser.

    Firebase Analytics SDK

    Apps are usually tracked using the Firebase Analytics SDK. A good starting point is the following URL: https://firebase.google.com/docs/analytics/get-started?platform=android&hl=en

    App hits use their own endpoint and format: they go to https://app-measurement.com and the payload is sent in a binary format, which makes it really difficult to debug, even when using Charles, Fiddler, or any other MITM proxy app.

    If you want to debug your Firebase implementation, I recommend you use my Android Debugger for Windows. Once you install the app, you'll be able to request a free lifetime license.

    Google Analytics 4 Measurement Protocol

    Google Analytics finally offers a proper Measurement "Protocol", which at the time of writing this post is in Beta.

    This protocol uses the https://www.google-analytics.com/mp/collect endpoint, and rather than having developers build the request payloads using some non-intuitive keys, it now accepts a POST request with a JSON string in the body, using the application/json Content-Type:

    // Server-side only: the api_secret in the URL must never be exposed in the browser.
    fetch('https://www.google-analytics.com/mp/collect?measurement_id=G-THYNGSTER&api_secret=12zneF6DSDFSDFjJPgDAzzQ', {
      method: "POST",
      headers: {
        'Content-Type': 'application/json'
      },
      body: JSON.stringify({
        "client_id": "12345678.87654321",
        "user_id": "RandomUserIdHash",
        "events": [{
          "name": "follow_me_at_twitter",
          "params": {
            "twitter_handle": "@thyng",
            "value": 7.77
          }
        }, {
          "name": "follow_intent",
          "params": {
            "status": "success"
          }
        }]
      })
    });
                                    
    Key                     Type    Notes
    client_id               str     Required.
    user_id                 str     Optional.
    timestamp_micros        int     Optional. Hit offset. Up to 3 days (2.592e+11 microseconds) in the past, based on the property's defined timezone.
    user_properties         {}      Optional.
    non_personalized_ads    bool    Optional. Set to true to keep these events out of ads personalization.
    events[]                []      Required. (Max 25 events per request.)
    events[].name           str     Required.
    events[].params         {}      Optional.
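The required keys and the 25-events cap from the table above are easy to check before sending anything. The following is a minimal sketch, not an official client: `buildMpBody` and its option names are hypothetical, and it only enforces the constraints listed in the table.

```javascript
const MAX_EVENTS_PER_REQUEST = 25; // limit taken from the table above

// Hypothetical helper: builds the JSON body for a Measurement Protocol request
// and rejects payloads that break the constraints listed in the table.
function buildMpBody({ clientId, userId, events, timestampMicros }) {
  if (!clientId) {
    throw new Error('client_id is required');
  }
  if (!Array.isArray(events) || events.length === 0) {
    throw new Error('events must be a non-empty array');
  }
  if (events.length > MAX_EVENTS_PER_REQUEST) {
    throw new Error(`too many events: ${events.length} > ${MAX_EVENTS_PER_REQUEST}`);
  }
  for (const event of events) {
    if (!event.name) {
      throw new Error('every event needs a name');
    }
  }

  const body = { client_id: clientId, events };
  if (userId) body.user_id = userId; // optional
  if (timestampMicros !== undefined) body.timestamp_micros = timestampMicros; // optional
  return JSON.stringify(body);
}

const exampleBody = buildMpBody({
  clientId: '12345678.87654321',
  events: [{ name: 'follow_intent', params: { status: 'success' } }],
});
console.log(exampleBody);
```

The returned string can be passed as the `body` of the `fetch` call shown earlier, from server-side code only.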

    In any case, there are some things to keep in mind. You should never expose your API Secret, which means this endpoint should not be used client-side. It's better suited to tracking offline interactions (like refunds) or to tracking our transactions server-side.

    At the time of writing this post (Apr 2022), one of the biggest handicaps of this protocol is that it doesn't support any session ID parameter, meaning that you won't be able to stitch server-side hits to the client-side session. This should be fixed over the coming months.

    In the meantime, I've published the GA4 Payload Parameters CheatSheet, which you can use to send server-side hits the old-school way (like we used to do with the first Measurement Protocol for Universal Analytics), attaching the "&sid" parameter.

    There are, of course, other points to keep in mind, such as the reserved event and parameter names in GA4 that you should not use. We'll cover this later in the "events" section.

    Events Model / Hit Types

    Let's start by saying that everything in Google Analytics 4 is an "event". I'm sure it's not the first time you've heard that, and it's totally right. But strictly speaking, we were also sending "events" in Universal Analytics; we just called them "hit types".

    Technically speaking, nothing has changed at all. We have network requests to some endpoints. That's it! If you want to learn more about how the hits are built and sent from the web tracking library, take a look at the "GA4: Google Analytics Measurement Protocol version 2" post.

    The main difference in GA4 is that Google no longer offers a fixed tracking data model beyond page_views and e-commerce, meaning that the responsibility for building a proper data model falls on us. While working on our definition, we need to keep in mind that there are some predefined/reserved event and parameter names, and that there are limits to take into account (on total events, and on name and value lengths).

    Universal Analytics Hit Types Model

    If we take a closer look, we've been using "events" for our tracking in Google Analytics since the Urchin days. Yep, I'm not joking: we had them, we just called them "hit types".

    Just so you know, we could replicate the current Universal Analytics data model in Google Analytics 4 using the following table of events:

    Hit Type / Event                  Parameters
    pageview                          Location, Path, Title
    event                             Category, Action, Label, Value, Non Interaction
    timing                            Category, Variable, Label, Value
    social                            Network, Action, Opt. Target
    exception                         Description, Fatal
    screenview                        Screen Name
    transaction (Legacy Ecommerce)    Id, Affiliation, Revenue, Tax, Shipping, Coupon
    item (Legacy Ecommerce)           Id, Name, Brand, Category, Variant, Price, Quantity

    Google even offers a setting that automatically converts all your ga() calls to some predefined events in GA4. You can enable this feature from your Data Stream configuration, and all event, timing, and exception hits will be converted to GA4 events (a listener is added to the ga('send', 'event|exception|timing') calls to do this).

    This tool will map the data in the following way:

    Event Name         Parameters
    [event_name]       This will take the current eventAction
                       eventCategory > event_category
                       eventAction > event
                       eventLabel > event_label
                       eventValue > value
    timing_complete    timingCategory > event_category
                       timingLabel > event_label
                       timingValue > value
                       timingVar > name
    exception          exDescription > description
                       exFatal > fatal

    Beware: since it converts every Event Action into an event name, depending on your current event definitions in Universal Analytics you may end up hitting the unique event names limit (500).
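To make the mapping table above concrete, here is a minimal sketch that re-implements it by hand. `convertUaHit` is a hypothetical function, not what the gtag.js container actually runs, and it reads the "eventAction > event" row as "eventAction becomes the event name".

```javascript
// Hypothetical re-implementation of the mapping table above.
// Input field names are the Universal Analytics ones (eventCategory, timingVar, ...).
function convertUaHit(hitType, fields) {
  switch (hitType) {
    case 'event':
      return {
        name: fields.eventAction, // the GA4 event name is taken from eventAction
        params: {
          event_category: fields.eventCategory,
          event_label: fields.eventLabel,
          value: fields.eventValue,
        },
      };
    case 'timing':
      return {
        name: 'timing_complete',
        params: {
          event_category: fields.timingCategory,
          event_label: fields.timingLabel,
          value: fields.timingValue,
          name: fields.timingVar,
        },
      };
    case 'exception':
      return {
        name: 'exception',
        params: {
          description: fields.exDescription,
          fatal: fields.exFatal,
        },
      };
    default:
      return null; // other hit types are not covered by the table
  }
}

const converted = convertUaHit('event', {
  eventCategory: 'Video',
  eventAction: 'play',
  eventLabel: 'intro',
  eventValue: 3,
});
console.log(converted.name); // play
```

Because every distinct eventAction turns into a distinct event name, counting the unique actions in your Universal Analytics data gives a quick estimate of how close the conversion would bring you to the 500-name limit.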

    Google Analytics 4 Events

    Event Sources

    The events on Google Analytics 4 can come from 4 different sources. These are:

      Public Web Endpoint

      This is the main origin of GA4 events, and we've already talked about it. These are the events generated on our site by the gtag.js container (see the GA4 Payload Parameters CheatSheet).

      Measurement Protocol ( Server Side )

      Another source for our events is the Measurement Protocol. It works similarly to the public endpoint, but the hits are sent server-side and we need to include an API Secret in our requests.

      Internal self-generated Events

      This one can be a bit confusing: GA4 auto-generates some of the events we see in the reports. This means we'll see some events in our reports that we won't see in our browser.

      This doesn't mean they're generated randomly or by some server-side logic. Most (if not all) of these events are created because a parameter was added to some event.

      Our event payloads may sometimes carry extra parameters that make GA4 internally spawn a separate event. As far as I've been able to identify, this is the list of internally generated events and the parameter that triggers each of them.

      Event Name         Trigger
      session_start      &_ss
      first_visit        &_fv
      user_engagement    &seg

      For example, if the current event payload contains a &_ss parameter, a session_start event will be generated; if it contains &_fv, we should see a first_visit event, and so on. This list may grow in the future (and it may be missing some events that I've not been able to spot yet).

      If we've enabled Enhanced Measurement, we may also see some extra events in our reports (this time the events will be visible within the browser requests). These are:
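The trigger table above can be turned into a small lookup for debugging captured hits. This is only a sketch under the assumptions in this post: `internalEventsFor` is a hypothetical helper, the trigger list is the one I observed (not an official or complete one), and it only looks at the query string of the hit URL.

```javascript
// Trigger parameters as listed in the table above (observed, not official).
const INTERNAL_EVENT_TRIGGERS = {
  _ss: 'session_start',
  _fv: 'first_visit',
  seg: 'user_engagement',
};

// Hypothetical helper: returns the internal events a hit is expected to spawn,
// based on which trigger parameters are present in its query string.
function internalEventsFor(hitUrl) {
  const params = new URL(hitUrl).searchParams;
  return Object.entries(INTERNAL_EVENT_TRIGGERS)
    .filter(([param]) => params.has(param))
    .map(([, eventName]) => eventName);
}

console.log(internalEventsFor('https://endpoint.url/collect?_ss=1&_fv=1'));
// [ 'session_start', 'first_visit' ]
```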

      Event Name                              Parameters
      click                                   link_id, link_classes, link_url, link_domain, outbound
      file_download                           link_id, link_text, link_url, file_name, file_extension
      video_play, video_pause, video_seek,    video_url, video_title, video_provider, video_current_time,
      video_buffering, video_progress,        video_duration, video_percent, visible
      video_complete
      view_search_results                     search_term
      scroll                                  percent_scrolled
      page_view                               page_referrer ( URL and Title are Shared Parameters )


      On the other hand, the Firebase Analytics SDK automatically tracks a lot of events without us needing to define them explicitly.

      Here is the current list of event names auto-generated by Firebase:

      ad_activeview              APP
      ad_click                   APP
      ad_exposure                APP
      ad_impression              APP
      ad_query                   APP
      adunit_exposure            APP
      app_clear_data             APP
      app_install                APP
      app_update                 APP
      app_remove                 APP
      error                      APP
      first_open                 APP
      in_app_purchase            APP
      notification_dismiss       APP
      notification_foreground    APP
      notification_open          APP
      notification_receive       APP
      os_update                  APP
      screen_view                APP
      user_engagement            APP
      Note: These events do not count towards the unique event names limit.
                                      

      Admin defined events

      We've already talked about these: when we create or modify an event in the admin section, those settings are translated into client-side tracking code.

      This means the following:

        Events Limitations

        Google Analytics 4 is full of limitations in many areas, and it's a bit difficult to understand them all, even more so when the limits keep changing.

        We have limits on the length of event names and values, and the same goes for event parameters and user properties. We also have a limit on how many parameters and properties we can attach to each event. And these limits may vary between the free and 360 versions.

        There are also some export limitations (the free version is capped at 1M hits exported to BigQuery per day), and data retention limits: the free version tops out at 14 months, while 360 allows holding up to 50 months of data.

        And that's not all ... we also have limits on the total conversions, audiences, insights, and funnels we can set. These are not directly related to events, so if you're interested you can visit the official Configuration Limits information.

        Collecting and Names Limitations

        We can attach up to 25 event parameters (100 on GA4 360) to each event, and we can easily identify them in our hits: they are the keys matching "^ep(|n).*". Event parameters are meant to add some metadata to our events.

        ep.event_origin: gtag
                                        

        Each of these parameters should have a name no longer than 40 characters and a value no longer than 100 characters.

        At the same time, we have the "user properties". We can attach up to 25 user properties to each hit; these are attributes that describe segments of our users. For example, we could record the current user's newsletter sign-up status, or the total purchases made by the current user. We can identify this data in our hits because the keys match "^up(|n).*":

        up.newsletter_opt_in: yes
        upn.user_total_purchases: 43
                                        

        Each of these properties should have a name no longer than 24 characters and a value no longer than 36 characters.

        Logged item          Limit                    Free / 360
        Events               Event name               40 chars
                             Event parameter name     40 chars
                             Event parameter value    100 chars
                             Params per event         25 / 100
        User properties      Total per property       25
                             Property name            24 chars
                             Property value           36 chars
        User-ID                                       256 characters
        Custom dimensions    Event scope              50 / 125
                             Item scope               10
                             User scope               25 / 100
        Custom metrics       Event scope              50 / 125
        Events offset                                 3 days
        Full Limits Table
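Since nothing warns us in the browser when a name or value is too long, a small pre-flight check can save some debugging. The sketch below uses the free-tier numbers from the table above; `checkEventLimits` and the `LIMITS` object are hypothetical names, and the limits themselves may change over time.

```javascript
// Free-tier limits as quoted in the table above (they may change over time).
const LIMITS = {
  eventNameLength: 40,
  paramNameLength: 40,
  paramValueLength: 100,
  paramsPerEvent: 25, // 100 on GA4 360
};

// Hypothetical helper: returns a list of human-readable problems (empty if none).
function checkEventLimits(name, params, limits = LIMITS) {
  const problems = [];
  if (name.length > limits.eventNameLength) {
    problems.push(`event name "${name}" is longer than ${limits.eventNameLength} chars`);
  }
  const entries = Object.entries(params);
  if (entries.length > limits.paramsPerEvent) {
    problems.push(`${entries.length} params exceed the ${limits.paramsPerEvent} per-event limit`);
  }
  for (const [key, value] of entries) {
    if (key.length > limits.paramNameLength) {
      problems.push(`param name "${key}" is longer than ${limits.paramNameLength} chars`);
    }
    // Only string values have a length to check; numbers are left alone.
    if (typeof value === 'string' && value.length > limits.paramValueLength) {
      problems.push(`value of "${key}" is longer than ${limits.paramValueLength} chars`);
    }
  }
  return problems;
}

console.log(checkEventLimits('follow_intent', { status: 'success' })); // []
```

Passing a different `limits` object (for example with `paramsPerEvent: 100`) adapts the same check to a 360 property.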

        Event Values Typing

        You may have noticed that some of the parameters start with up, ep, upn, or epn. This is because an event parameter/user property can be either a string or a number. The good news is that we don't need to declare the type, since GA4 types them automatically. Just take a look at the logic used to decide whether a parameter is a string or a number:

        var value = 'something';
        if(typeof(value) === "number" && !isNaN(value)){
            console.log("is a number parameter");   
        }else{
            console.log("is a string parameter");
        }
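Building on that check, the payload key for a parameter can be derived from its value. This is a sketch of the idea rather than GA4's own code: `payloadKey` is a hypothetical helper that appends the "n" to the prefix for numeric values, matching the ep./epn. and up./upn. keys shown earlier.

```javascript
// Hypothetical helper mirroring the check above: numeric values get the "n"
// suffix on the prefix (epn. / upn.), everything else is sent as a string.
function payloadKey(scope, name, value) {
  const isNumber = typeof value === 'number' && !isNaN(value);
  return `${scope}${isNumber ? 'n' : ''}.${name}`;
}

console.log(payloadKey('ep', 'event_origin', 'gtag'));      // ep.event_origin
console.log(payloadKey('up', 'user_total_purchases', 43));  // upn.user_total_purchases
console.log(payloadKey('ep', 'value', '7.77'));             // ep.value (a numeric string is still a string)
```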
                                        

        SGTM - Google Analytics 4 Hits

        The last thing I want to call out is that GA4 hits sent via Server-Side Google Tag Manager are capable of two things that we won't see with regular hits.

        The first is that hits going through the server-side container can set first-party cookies in the user's browser; this is achieved by adding a Set-Cookie header to the response.

        The second is that the response may contain a body, which is used to send some pixels back to the client. That is, SGTM builds a pixel request and returns it to the browser so that it gets sent from there, for example when a third-party cookie value is missing (in which case sending it server-side wouldn't make any difference).

        More Questions

        How can I identify a conversion?

        If the current event has a &_c=1 parameter, it will be counted as a conversion.
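When going through captured requests, that check can be written as a one-liner. A minimal sketch, assuming the parameter travels in the query string of the hit URL; `isConversionHit` is a hypothetical name.

```javascript
// Hypothetical helper: true when the hit URL carries the _c=1 conversion flag.
function isConversionHit(hitUrl) {
  return new URL(hitUrl).searchParams.get('_c') === '1';
}

console.log(isConversionHit('https://endpoint.url/collect?_c=1')); // true
```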

        Are there any e-commerce limits?

        Yes, there are, as far as I've been able to deduce from the code.

          It takes some seconds to see my hits

          Google Analytics 4 can delay the firing of hits by up to 5 seconds. This is because it uses an internal queue to batch events and save some hit requests. At this time there is no way to "force" the queue to dispatch, although there are some situations where the queue is skipped and events are sent right away. One example is the first time a visitor comes to your site (i.e. when there's no cookie present).

          Why can't I use any of my parameters on the reports?

          You can send ANY parameters along with your events, but this doesn't mean you'll be able to use them in your exploration reports. This can be confusing because, while you'll see the parameters in the Real-Time reports, you need to set them up as dimensions in the admin before you can use them. If you think about it, it makes sense: the real-time report is just a streaming report where no data is parsed or processed at all, and we can't expect GA4 to process all the data coming in with the events, so it only processes the parameters we've configured. We need to set them up in the Custom Definitions section.

          I've set-up my dimensions, but they show no data

          I'm not sure if this is only me, but it drove me crazy at times. If you add a new event with some parameters and go straight to adding them in the admin, they won't show up in the list, although you'll be able to type the parameter name manually. Every time I did this, I got no data for that dimension. My advice is to wait some hours before creating the custom definition, and only create it once the parameter is offered for selection (rather than typing it manually). If you did it wrong, the only solution that worked for me was archiving the dimension and re-creating it.