Integration Specifications
This article will help you learn about how Daasity replicates data from GA4 BigQuery, limitations to the data we can extract and where the data is stored in the GA4 BigQuery schema.
Integration Overview
Google Analytics V4 is the newest version of Google Analytics, wherein raw event data from Google Analytics V4 properties can be exported to BigQuery and then fetched through a series of SQL queries.
This document provides context on what kind of data is being gathered through this extractor, which endpoints that data is coming from, and how the extracted tables relate to each other.
Integration Availability
This integration is available for:
Enterprise
Growth
BigQuery Export
The Daasity GA4 BigQuery extractor is built based on this GA4 BigQuery Export Documentation. Staging tables are extracted from BigQuery Exports via SQL and are then stitched together during the transformation process to create the standard GA report tables. The Daasity GA4 BigQuery extractor fetches reports in the following time periods:
Full day (tables ending in _stg1)
Intraday (tables ending in _stg2)
Hourly
NOTE:
Daasity includes a cutover date for merchants that have both a GA3 and a GA4 integration. This allows merchants to link the integrations and set a date after which the UTS tables will be populated with data from GA4 instead of GA3.
Intraday reports are currently unavailable but will be optional for merchants soon.
Entity Relationship Diagram (ERD)
Click here to view the Daasity GA4 BigQuery integration illustrating the different tables and keys to join across tables.
GA4 BigQuery Schema
The Daasity GA4 BigQuery extractor creates these tables using SQL and replication methods listed. The data is mapped from BigQuery to the table based on the mapping logic outlined in each table.
Base Landing Page
Endpoint: BigQuery Export
Update Method: UPSERT
Table Names:
[
ga4_bq_stg.base_landing_page_stg1
][
ga4_bq_stg.base_landing_page_stg2
]
SQL | Database Column |
---|---|
MD5('<%= integration.account.id %>'||':'||stream_id||':'||created_on||':'||IFNULL(REGEXP_REPLACE(landing_page,r'\?.*',''),'')) | id |
stream_id | stream_id |
PARSE_TIMESTAMP('%Y%m%d', event_date) | created_on |
REGEXP_REPLACE(landing_page,r'\?.*','') | landing_page_path |
SUM(pageviews) | pageviews |
COUNT(DISTINCT ga_session_id) | sessions |
COUNT(DISTINCT CASE WHEN is_engaged = 1 THEN ga_session_id END) | engaged_sessions |
SUM(transactions) | transactions |
SUM(session_duration) | session_duration |
SUM(transaction_revenue) | transaction_revenue |
'events' | __source_table |
PARSE_TIMESTAMP('%Y%m%d', _TABLE_SUFFIX) | __file_date |
MD5('<%= integration.account.id %>'||':'||stream_id||':'||created_on||':'||IFNULL(REGEXP_REPLACE(landing_page,r'\?.*',''),'')) | __sync_key |
'<%= integration.account.id %>' | _account_id |
CURRENT_TIMESTAMP() | __synced_at |
Base PDP Performance
Endpoint: BigQuery Export
Update Method: UPSERT
Table Names:
[
ga4_bq_stg.base_pdp_performance_stg1
][
ga4_bq_stg.base_pdp_performance_stg2
]
SQL | Database Column |
---|---|
MD5('<%= integration.account.id %>'||':'||stream_id||':'||event_date||':'||item_name||':'||item_id||':'||IFNULL(currency,'')) | id |
stream_id | stream_id |
PARSE_TIMESTAMP('%Y%m%d', event_date) | created_on |
item_name | product_name |
item_id | product_sku |
currency | currency |
COUNT(CASE WHEN event_name = 'view_item' THEN CONCAT(user_pseudo_id,event_timestamp) END) | product_detail_views |
COUNT(CASE WHEN event_name = 'add_to_cart' THEN CONCAT(user_pseudo_id,event_timestamp) END) | product_adds_to_cart |
COUNT(CASE WHEN event_name = 'begin_checkout' THEN CONCAT(user_pseudo_id,event_timestamp) END) | product_checkouts |
SUM(CASE WHEN event_name = 'purchase' THEN item_revenue ELSE 0 END) | product_sales |
COUNT(DISTINCT CASE WHEN event_name = 'purchase' THEN transaction_id END) | product_purchases |
'events' | __source_table |
PARSE_TIMESTAMP('%Y%m%d', _TABLE_SUFFIX) | __file_date |
MD5('<%= integration.account.id %>'||':'||stream_id||':'||event_date||':'||item_name||':'||item_id||':'||IFNULL(currency,'')) | __sync_key |
'<%= integration.account.id %>' | _account_id |
CURRENT_TIMESTAMP() | __synced_at |
Base Shopping Stage
Endpoint: BigQuery Export
Update Method: UPSERT
Table Names:
[
ga4_bq_stg.base_shopping_stage_stg1
][
ga4_bq_stg.base_shopping_stage_stg2
]
SQL | Database Column |
---|---|
MD5('<%= integration.account.id %>'||':'||stream_id||':'||created_on||':'||IFNULL(source,'')||':'||IFNULL(medium,'')||':'||IFNULL(campaign,'')||':'||IFNULL(term,'')||':'||IFNULL(content,'')||':'||shopping_stage||':'||CASE WHEN ga_session_id IS NOT NULL THEN 'First-time' ELSE 'Returning' END||device_category) | id |
stream_id | stream_id |
PARSE_TIMESTAMP('%Y%m%d', event_date) | created_on |
source | source |
medium | medium |
campaign | campaign |
shopping_stage | shopping_stage |
NULL | channel_grouping |
CASE WHEN ga_session_id IS NOT NULL THEN 'First-time' ELSE 'Returning' END | user_type |
device_category | device_category |
COUNT(DISTINCT ga_session_id) | sessions |
term | keyword |
content | adcontent |
NULL | channel |
NULL | vendor |
NULL | sub_channel |
NULL | media_type |
NULL | attribution_channel |
'events' | __source_table |
PARSE_TIMESTAMP('%Y%m%d', _TABLE_SUFFIX) | __file_date |
MD5('<%= integration.account.id %>'||':'||stream_id||':'||created_on||':'||IFNULL(source,'')||':'||IFNULL(medium,'')||':'||IFNULL(campaign,'')||':'||IFNULL(term,'')||':'||IFNULL(content,'')||':'||shopping_stage||':'||CASE WHEN ga_session_id IS NOT NULL THEN 'First-time' ELSE 'Returning' END||device_category) | __sync_key |
'<%= integration.account.id %>' | _account_id |
CURRENT_TIMESTAMP() | __synced_at |
Base Transactions
Endpoint: BigQuery Export
Update Method: UPSERT
Table Names:
[
ga4_bq_stg.base_transactions_stg1
][
ga4_bq_stg.base_transactions_stg2
]
SQL | Database Column |
---|---|
MD5('<%= integration.account.id %>'||':'||stream_id||':'||transaction_id) | id |
stream_id | stream_id |
PARSE_TIMESTAMP('%Y%m%d', event_date) | created_on |
transaction_id | transaction_id |
source | source |
medium | medium |
campaign | campaign |
NULL | channel_grouping |
CASE WHEN ga_session_id IS NOT NULL THEN 'First-time' ELSE 'Returning' END | user_type |
NULL | country_iso_code |
device_category | device_category |
NULLIF(country,'') | country |
city | city |
full_referrer | full_referrer |
region | region |
browser | browser |
operating_system | operating_system |
term | keyword |
content | adcontent |
FIRST_VALUE(purchase_revenue_in_usd IGNORE NULLS) OVER (PARTITION BY transaction_id ORDER BY event_timestamp DESC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) | transaction_revenue |
NULL | channel |
NULL | vendor |
NULL | sub_channel |
NULL | media_type |
NULL | attribution_channel |
'events' | __source_table |
PARSE_TIMESTAMP('%Y%m%d', _TABLE_SUFFIX) | __file_date |
MD5('<%= integration.account.id %>'||':'||stream_id||':'||transaction_id) | __sync_key |
'<%= integration.account.id %>' | _account_id |
CURRENT_TIMESTAMP() | __synced_at |
Base Traffic
Endpoint: BigQuery Export
Update Method: UPSERT
Table Names:
[
ga4_bq_stg.base_traffic_stg1
][
ga4_bq_stg.base_traffic_stg2
]
SQL | Database Column |
---|---|
MD5('<%= integration.account.id %>'||':'||stream_id||':'||created_on||':'||IFNULL(source,'')||':'||IFNULL(medium,'')||':'||IFNULL(campaign,'')||':'||IFNULL(term,'')||':'||IFNULL(content,'')||':'||CASE WHEN ga_session_id IS NOT NULL THEN 'First-time' ELSE 'Returning' END||':'||IFNULL(device_category,'')||':'||IFNULL(country_name,'')) | id |
stream_id | stream_id |
PARSE_TIMESTAMP('%Y%m%d', event_date) | created_on |
source | source |
medium | medium |
campaign | campaign |
NULL | channel_grouping |
CASE WHEN ga_session_id IS NOT NULL THEN 'First-time' ELSE 'Returning' END | user_type |
device_category | device_category |
NULL | country_iso_code |
NULLIF(country_name,'') | country_name |
SUM(pageviews) | pageviews |
COUNT(DISTINCT ga_session_id) | sessions |
COUNT(DISTINCT CASE WHEN is_engaged = 1 THEN ga_session_id END) | engaged_sessions |
SUM(transactions) | transactions |
SUM(session_duration) | session_duration |
NULL | time_on_page |
SUM(transaction_revenue) | transaction_revenue |
term | keyword |
content | adcontent |
NULL | channel |
NULL | vendor |
NULL | sub_channel |
NULL | media_type |
NULL | attribution_channel |
'events' | __source_table |
PARSE_TIMESTAMP('%Y%m%d', _TABLE_SUFFIX) | __file_date |
MD5('<%= integration.account.id %>'||':'||sm.stream_id||':'||sm.created_on||':'||IFNULL(ts.source,'')||':'||IFNULL(ts.medium,'')||':'||IFNULL(ts.campaign,'')||':'||IFNULL(ts.term,'')||':'||IFNULL(ts.content,'')||':'||CASE WHEN ftu.ga_session_id IS NOT NULL THEN 'First-time' ELSE 'Returning' END||':'||IFNULL(sm.device_category,'')||':'||IFNULL(sm.country_name,'')) | sync_key |
'<%= integration.account.id %>' | _account_id |
CURRENT_TIMESTAMP() | __synced_at |
Base Traffic Hourly
Endpoint: BigQuery Export
Update Method: UPSERT
Table Names:
[
ga4_bq_stg.base_traffic_hourly_history_stg1
][
ga4_bq_stg.base_traffic_hourly_history_stg2
]
SQL | Database Column |
---|---|
MD5(sm.stream_id||':'||sm.created_on||':'||DATETIME_TRUNC(DATETIME(TIMESTAMP_MICROS(st.event_timestamp)),hour)||':'||IFNULL(ts.source,'')||':'||IFNULL(ts.medium,'')||':'||IFNULL(ts.campaign,'')||':'||IFNULL(ts.term,'')||':'||IFNULL(ts.content,'')||':'||IFNULL(ts.full_referrer,'')||':'||CASE WHEN ftu.ga_session_id IS NOT NULL THEN 'First-time' ELSE 'Returning' END||':'||IFNULL(sm.device_category,'')||':'||IFNULL(sm.country_name,'')) | id |
stream_id | stream_id |
PARSE_TIMESTAMP('%Y%m%d', event_date) | created_on |
DATETIME_TRUNC(DATETIME(TIMESTAMP_MICROS(event_timestamp)),hour) AS date_hour_utc | date_hour_utc |
source | source |
medium | medium |
campaign | campaign |
NULL | channel_grouping |
term | keyword |
content | adcontent |
full_referrer | full_referrer |
CASE WHEN ftu.ga_session_id IS NOT NULL THEN 'First-time' ELSE 'Returning' END | user_type |
device_category | device_category |
NULL | country_iso_code |
NULLIF(sm.country_name,'') | country_name |
SUM(pageviews) | pageviews |
COUNT(DISTINCT ga_session_id) | sessions |
COUNT(DISTINCT CASE WHEN is_engaged = 1 THEN ga_session_id END) | engaged_sessions |
SUM(transactions) | transactions |
SUM(transaction_revenue) | transaction_revenue |
NULL | channel |
NULL | vendor |
NULL | sub_channel |
NULL | media_type |
NULL | attribution_channel |
'events' | __source_table |
PARSE_TIMESTAMP('%Y%m%d', _TABLE_SUFFIX) | __file_date |
MD5(sm.stream_id||':'||sm.created_on||':'||DATETIME_TRUNC(DATETIME(TIMESTAMP_MICROS(st.event_timestamp)),hour)||':'||IFNULL(ts.source,'')||':'||IFNULL(ts.medium,'')||':'||IFNULL(ts.campaign,'')||':'||IFNULL(ts.term,'')||':'||IFNULL(ts.content,'')||':'||IFNULL(ts.full_referrer,'')||':'||CASE WHEN ftu.ga_session_id IS NOT NULL THEN 'First-time' ELSE 'Returning' END||':'||IFNULL(sm.device_category,'')||':'||IFNULL(sm.country_name,'')) | __sync_key |
'<%= integration.account.id %>' | _account_id |
CURRENT_TIMESTAMP() | __synced_at |
Last updated