Integration Specifications
This article will help you learn about how Daasity replicates data from GA4 BigQuery, limitations to the data we can extract and where the data is stored in the GA4 BigQuery schema.
Integration Overview
Google Analytics V4 is the newest version of Google Analytics, wherein raw event data from Google Analytics V4 properties can be exported to BigQuery and then fetched through a series of SQL queries.
This document provides context on what kind of data is being gathered through this extractor, which endpoints that data is coming from, and how the extracted tables relate to each other.
Integration Availability
This integration is available for:
Enterprise
Growth
BigQuery Export
The Daasity GA4 BigQuery extractor is built based on this GA4 BigQuery Export Documentation. Staging tables are extracted from BigQuery Exports via SQL and are then stitched together during the transformation process to create the standard GA report tables. The Daasity GA4 BigQuery extractor fetches reports in the following time periods:
Full day (tables ending in _stg1)
Intraday (tables ending in _stg2)
Hourly
NOTE:
Daasity includes a cutover date for merchants that have both a GA3 and a GA4 integration. This allows merchants to link the integrations and set a date after which the UTS tables will be populated with data from GA4 instead of GA3.
Intraday reports are currently unavailable but will be optional for merchants soon.
Entity Relationship Diagram (ERD)
Click here to view the Daasity GA4 BigQuery integration illustrating the different tables and keys to join across tables.
GA4 BigQuery Schema
The Daasity GA4 BigQuery extractor creates these tables using SQL and replication methods listed. The data is mapped from BigQuery to the table based on the mapping logic outlined in each table.
Base Landing Page
Endpoint: BigQuery Export
Update Method: UPSERT
Table Names:
[
ga4_bq_stg.base_landing_page_stg1
][
ga4_bq_stg.base_landing_page_stg2
]
MD5('<%= integration.account.id %>'||':'||stream_id||':'||created_on||':'||IFNULL(REGEXP_REPLACE(landing_page,r'\?.*',''),''))
id
stream_id
stream_id
PARSE_TIMESTAMP('%Y%m%d', event_date)
created_on
REGEXP_REPLACE(landing_page,r'\?.*','')
landing_page_path
SUM(pageviews)
pageviews
COUNT(DISTINCT ga_session_id)
sessions
COUNT(DISTINCT CASE WHEN is_engaged = 1 THEN ga_session_id END)
engaged_sessions
SUM(transactions)
transactions
SUM(session_duration)
session_duration
SUM(transaction_revenue)
transaction_revenue
'events'
__source_table
PARSE_TIMESTAMP('%Y%m%d', _TABLE_SUFFIX)
__file_date
MD5('<%= integration.account.id %>'||':'||stream_id||':'||created_on||':'||IFNULL(REGEXP_REPLACE(landing_page,r'\?.*',''),''))
__sync_key
'<%= integration.account.id %>'
_account_id
CURRENT_TIMESTAMP()
__synced_at
Base PDP Performance
Endpoint: BigQuery Export
Update Method: UPSERT
Table Names:
[
ga4_bq_stg.base_pdp_performance_stg1
][
ga4_bq_stg.base_pdp_performance_stg2
]
MD5('<%= integration.account.id %>'||':'||stream_id||':'||event_date||':'||item_name||':'||item_id||':'||IFNULL(currency,''))
id
stream_id
stream_id
PARSE_TIMESTAMP('%Y%m%d', event_date)
created_on
item_name
product_name
item_id
product_sku
currency
currency
COUNT(CASE WHEN event_name = 'view_item' THEN CONCAT(user_pseudo_id,event_timestamp) END)
product_detail_views
COUNT(CASE WHEN event_name = 'add_to_cart' THEN CONCAT(user_pseudo_id,event_timestamp) END)
product_adds_to_cart
COUNT(CASE WHEN event_name = 'begin_checkout' THEN CONCAT(user_pseudo_id,event_timestamp) END)
product_checkouts
SUM(CASE WHEN event_name = 'purchase' THEN item_revenue ELSE 0 END)
product_sales
COUNT(DISTINCT CASE WHEN event_name = 'purchase' THEN transaction_id END)
product_purchases
'events'
__source_table
PARSE_TIMESTAMP('%Y%m%d', _TABLE_SUFFIX)
__file_date
MD5('<%= integration.account.id %>'||':'||stream_id||':'||event_date||':'||item_name||':'||item_id||':'||IFNULL(currency,''))
__sync_key
'<%= integration.account.id %>'
_account_id
CURRENT_TIMESTAMP()
__synced_at
Base Shopping Stage
Endpoint: BigQuery Export
Update Method: UPSERT
Table Names:
[
ga4_bq_stg.base_shopping_stage_stg1
][
ga4_bq_stg.base_shopping_stage_stg2
]
MD5('<%= integration.account.id %>'||':'||stream_id||':'||created_on||':'||IFNULL(source,'')||':'||IFNULL(medium,'')||':'||IFNULL(campaign,'')||':'||IFNULL(term,'')||':'||IFNULL(content,'')||':'||shopping_stage||':'||CASE WHEN ga_session_id IS NOT NULL THEN 'First-time' ELSE 'Returning' END||device_category)
id
stream_id
stream_id
PARSE_TIMESTAMP('%Y%m%d', event_date)
created_on
source
source
medium
medium
campaign
campaign
shopping_stage
shopping_stage
NULL
channel_grouping
CASE WHEN ga_session_id IS NOT NULL THEN 'First-time' ELSE 'Returning' END
user_type
device_category
device_category
COUNT(DISTINCT ga_session_id)
sessions
term
keyword
content
adcontent
NULL
channel
NULL
vendor
NULL
sub_channel
NULL
media_type
NULL
attribution_channel
'events'
__source_table
PARSE_TIMESTAMP('%Y%m%d', _TABLE_SUFFIX)
__file_date
MD5('<%= integration.account.id %>'||':'||stream_id||':'||created_on||':'||IFNULL(source,'')||':'||IFNULL(medium,'')||':'||IFNULL(campaign,'')||':'||IFNULL(term,'')||':'||IFNULL(content,'')||':'||shopping_stage||':'||CASE WHEN ga_session_id IS NOT NULL THEN 'First-time' ELSE 'Returning' END||device_category)
__sync_key
'<%= integration.account.id %>'
_account_id
CURRENT_TIMESTAMP()
__synced_at
Base Transactions
Endpoint: BigQuery Export
Update Method: UPSERT
Table Names:
[
ga4_bq_stg.base_transactions_stg1
][
ga4_bq_stg.base_transactions_stg2
]
MD5('<%= integration.account.id %>'||':'||stream_id||':'||transaction_id)
id
stream_id
stream_id
PARSE_TIMESTAMP('%Y%m%d', event_date)
created_on
transaction_id
transaction_id
source
source
medium
medium
campaign
campaign
NULL
channel_grouping
CASE WHEN ga_session_id IS NOT NULL THEN 'First-time' ELSE 'Returning' END
user_type
NULL
country_iso_code
device_category
device_category
NULLIF(country,'')
country
city
city
full_referrer
full_referrer
region
region
browser
browser
operating_system
operating_system
term
keyword
content
adcontent
FIRST_VALUE(purchase_revenue_in_usd IGNORE NULLS) OVER (PARTITION BY transaction_id ORDER BY event_timestamp DESC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
transaction_revenue
NULL
channel
NULL
vendor
NULL
sub_channel
NULL
media_type
NULL
attribution_channel
'events'
__source_table
PARSE_TIMESTAMP('%Y%m%d', _TABLE_SUFFIX)
__file_date
MD5('<%= integration.account.id %>'||':'||stream_id||':'||transaction_id)
__sync_key
'<%= integration.account.id %>'
_account_id
CURRENT_TIMESTAMP()
__synced_at
Base Traffic
Endpoint: BigQuery Export
Update Method: UPSERT
Table Names:
[
ga4_bq_stg.base_traffic_stg1
][
ga4_bq_stg.base_traffic_stg2
]
MD5('<%= integration.account.id %>'||':'||stream_id||':'||created_on||':'||IFNULL(source,'')||':'||IFNULL(medium,'')||':'||IFNULL(campaign,'')||':'||IFNULL(term,'')||':'||IFNULL(content,'')||':'||CASE WHEN ga_session_id IS NOT NULL THEN 'First-time' ELSE 'Returning' END||':'||IFNULL(device_category,'')||':'||IFNULL(country_name,''))
id
stream_id
stream_id
PARSE_TIMESTAMP('%Y%m%d', event_date)
created_on
source
source
medium
medium
campaign
campaign
NULL
channel_grouping
CASE WHEN ga_session_id IS NOT NULL THEN 'First-time' ELSE 'Returning' END
user_type
device_category
device_category
NULL
country_iso_code
NULLIF(country_name,'')
country_name
SUM(pageviews)
pageviews
COUNT(DISTINCT ga_session_id)
sessions
COUNT(DISTINCT CASE WHEN is_engaged = 1 THEN ga_session_id END)
engaged_sessions
SUM(transactions)
transactions
SUM(session_duration)
session_duration
NULL
time_on_page
SUM(transaction_revenue)
transaction_revenue
term
keyword
content
adcontent
NULL
channel
NULL
vendor
NULL
sub_channel
NULL
media_type
NULL
attribution_channel
'events'
__source_table
PARSE_TIMESTAMP('%Y%m%d', _TABLE_SUFFIX)
__file_date
MD5('<%= integration.account.id %>'||':'||sm.stream_id||':'||sm.created_on||':'||IFNULL(ts.source,'')||':'||IFNULL(ts.medium,'')||':'||IFNULL(ts.campaign,'')||':'||IFNULL(ts.term,'')||':'||IFNULL(ts.content,'')||':'||CASE WHEN ftu.ga_session_id IS NOT NULL THEN 'First-time' ELSE 'Returning' END||':'||IFNULL(sm.device_category,'')||':'||IFNULL(sm.country_name,''))
sync_key
'<%= integration.account.id %>'
_account_id
CURRENT_TIMESTAMP()
__synced_at
Base Traffic Hourly
Endpoint: BigQuery Export
Update Method: UPSERT
Table Names:
[
ga4_bq_stg.base_traffic_hourly_history_stg1
][
ga4_bq_stg.base_traffic_hourly_history_stg2
]
MD5(sm.stream_id||':'||sm.created_on||':'||DATETIME_TRUNC(DATETIME(TIMESTAMP_MICROS(st.event_timestamp)),hour)||':'||IFNULL(ts.source,'')||':'||IFNULL(ts.medium,'')||':'||IFNULL(ts.campaign,'')||':'||IFNULL(ts.term,'')||':'||IFNULL(ts.content,'')||':'||IFNULL(ts.full_referrer,'')||':'||CASE WHEN ftu.ga_session_id IS NOT NULL THEN 'First-time' ELSE 'Returning' END||':'||IFNULL(sm.device_category,'')||':'||IFNULL(sm.country_name,''))
id
stream_id
stream_id
PARSE_TIMESTAMP('%Y%m%d', event_date)
created_on
DATETIME_TRUNC(DATETIME(TIMESTAMP_MICROS(event_timestamp)),hour) AS date_hour_utc
date_hour_utc
source
source
medium
medium
campaign
campaign
NULL
channel_grouping
term
keyword
content
adcontent
full_referrer
full_referrer
CASE WHEN ftu.ga_session_id IS NOT NULL THEN 'First-time' ELSE 'Returning' END
user_type
device_category
device_category
NULL
country_iso_code
NULLIF(sm.country_name,'')
country_name
SUM(pageviews)
pageviews
COUNT(DISTINCT ga_session_id)
sessions
COUNT(DISTINCT CASE WHEN is_engaged = 1 THEN ga_session_id END)
engaged_sessions
SUM(transactions)
transactions
SUM(transaction_revenue)
transaction_revenue
NULL
channel
NULL
vendor
NULL
sub_channel
NULL
media_type
NULL
attribution_channel
'events'
__source_table
PARSE_TIMESTAMP('%Y%m%d', _TABLE_SUFFIX)
__file_date
MD5(sm.stream_id||':'||sm.created_on||':'||DATETIME_TRUNC(DATETIME(TIMESTAMP_MICROS(st.event_timestamp)),hour)||':'||IFNULL(ts.source,'')||':'||IFNULL(ts.medium,'')||':'||IFNULL(ts.campaign,'')||':'||IFNULL(ts.term,'')||':'||IFNULL(ts.content,'')||':'||IFNULL(ts.full_referrer,'')||':'||CASE WHEN ftu.ga_session_id IS NOT NULL THEN 'First-time' ELSE 'Returning' END||':'||IFNULL(sm.device_category,'')||':'||IFNULL(sm.country_name,''))
__sync_key
'<%= integration.account.id %>'
_account_id
CURRENT_TIMESTAMP()
__synced_at
Last updated