Durch Anklicken „Akzeptieren“, stimmen Sie der Speicherung von Cookies auf Ihrem Gerät zu, um die Seitennavigation zu verbessern, die Nutzung der Website zu analysieren und unsere Marketingaktivitäten zu unterstützen. Sehen Sie sich unsere an Datenschutzrichtlinie für weitere Informationen.
Sep 4, 2024
Google Analytics

Google Analytics 4 (GA4) -Rohdaten in BigQuery (BQ)

Untersuchen Sie die ersten Besuche von Benutzern. Von Sanu Maharjan

After Google announced shutting down Universal Analytics (UA) a lot of users are bound to shift from UA to GA4. With this new change, it is essential to look at the data structure and how to extract important metrics from raw GA4. In this blog, I would like to introduce how to count event sessions, in a given calendar week, date, and hour in BigQuery.


We have already published an intro blog on how to query the data with static, dynamic, and hybrid dates. Please have a look at the blog from Vaishnavee to refresh your memory. With that in mind, let’s first look at the “Preview” of the data, before starting the query itself. The data is from a website that has been integrated with GA4 and the raw data is already imported to BQ.

Bildschirmfoto 2022-06-01 um 22.09.25.png

For this blog, let's focus on the following columns: event_date and event_name. Each row is created when a user interacts with your website or app. For example, first_visit is triggered when a new user visits the website or launches the app for the first time. Let's then look at the first visits to our website on different days and times. We are aiming for a table that looks like this:

Bildschirmfoto 2022-06-01 um 22.09.32.png

Assuming that, it is required to fetch data for this current running month-to-date (May 2022). The following query is valid:

SELECT *

FROM `PROJECT_NAME.analytics_PROPERTY_ID.events_*`

WHERE TRUE

AND _TABLE_SUFFIX BETWEEN '20220501' AND FORMAT_DATE('%Y%m%d', DATE_SUB( CURRENT_DATE(), INTERVAL 1 DAY))

Since we want to look at first_visits only, we will filter out data that has event_name = ‘first_visit’ and get the date from the event_date column. With the date column, it is possible to get the calendar week number. Here I have provided you with a special calendar week format that we use in datadice. If you have other approaches to format date to the responding week, let us know. Then we group it by calendar week and date and count the event_name. The following updated query will provide the result we want:

SELECT

CONCAT(FORMAT_DATE('%Y CW', PARSE_DATE('%Y%m%d', event_date)),

LPAD(CAST(CAST(FORMAT_DATE('%U', PARSE_DATE('%Y%m%d', event_date)) AS INT64)+1 AS STRING),2,'0')) AS calender_week,

PARSE_DATE('%Y%m%d', event_date) AS date,

COUNT(event_name) AS first_visits,

FROM `PROJECT_NAME.analytics_PROPERTY_ID.events_*`

WHERE TRUE

AND event_name = 'first_visit'

AND _TABLE_SUFFIX BETWEEN '20220501' AND FORMAT_DATE('%Y%m%d', DATE_SUB( CURRENT_DATE(), INTERVAL 1 DAY))

GROUP BY calendar_week, date

ORDER BY calendar_week, date

Now the remaining one is only the hour column, the time is stored in event_timestamp but in micros. In BQ, it is possible to change from micros to an hour-minute-second format by using TIMESTAMP_MICROS(event_timestamp).

Now using a CASE-WHEN statement, we will divide the hours into 4 groups: 00-06, 07-12, 13-18, and 19-23.

SELECT

CONCAT(FORMAT_DATE('%Y CW', PARSE_DATE('%Y%m%d', event_date)),

LPAD(CAST(CAST(FORMAT_DATE('%U', PARSE_DATE('%Y%m%d', event_date)) AS INT64)+1 AS STRING),2,'0')) AS calender_week,

PARSE_DATE('%Y%m%d', event_date) AS date,

CASE

  WHEN EXTRACT(HOUR FROM TIMESTAMP_MICROS(event_timestamp)) BETWEEN 0 AND 6                            

        THEN '00-06'

  WHEN EXTRACT(HOUR FROM TIMESTAMP_MICROS(event_timestamp)) BETWEEN 7 AND 12

 THEN '07-12'

 WHEN EXTRACT(HOUR FROM TIMESTAMP_MICROS(event_timestamp)) BETWEEN 13 AND 18

THEN '13-18'

 ELSE '19-23'

END AS hour,

COUNT(event_name) AS first_visits,

FROM `PROJECT_NAME.analytics_PROPERTY_ID.events_*`

WHERE TRUE

AND event_name = 'first_visit'

AND _TABLE_SUFFIX BETWEEN '20220501' AND FORMAT_DATE('%Y%m%d', DATE_SUB(

CURRENT_DATE(), INTERVAL 1 DAY))

GROUP BY calendar_week, date

ORDER BY calendar_week, date

With this query, you are able to count the number of first visits sub-grouped by calendar week, date, and hours.

Conclusion

This post is a part of the “Raw GA4 data in BigQuery” series to get some quick insights into the user behavior on your website or app. I will end this session here for now, but in the upcoming posts, I will try to provide more queries to analyze data from GA4 and build up more complex queries in the upcoming future.