BigQuery: flattening GA session-level and hit-level fields in the same query

Time: 2018-01-30 15:22:42

Tags: sql google-analytics google-bigquery bigdata standard-sql

In Standard SQL, I would like to be able to query all of the following in the same query:

  • customDimensions
  • hits.customDimensions
  • hits.customMetrics
  • hits.product.customDimensions

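So far I have come up with something like the following (it includes a UNION of two GA properties, one for mobile and one for desktop). I will be adding many more columns than this, and I can't imagine that writing them all as subselects is the best approach: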

SELECT
# standard session fields
date,
fullVisitorId,
visitId,
visitNumber,
TIMESTAMP_SECONDS(visitStartTime) visitStartTime,
totals.visits,
device.deviceCategory,
totals.hits,
totals.newVisits,
totals.pageviews,
totals.timeOnSite,
trafficSource.adContent,
trafficSource.campaign,
trafficSource.keyword,
trafficSource.medium,
trafficSource.referralPath,
trafficSource.source,
channelGrouping,
device.browser,
device.browserSize,
device.browserVersion,
device.mobileDeviceInfo,
device.mobileDeviceModel,
device.operatingSystem,
device.mobileDeviceBranding,
geoNetwork.country,
geoNetwork.city,
# Session/User customDimension Example
(SELECT cd.value FROM UNNEST(customDimensions) cd WHERE cd.index=20 and REGEXP_CONTAINS(cd.value, "\\d")) userId,

# hits customMetrics Example
SUM((SELECT SUM(hcm.value) FROM UNNEST(hits) h,UNNEST(h.customMetrics) hcm WHERE hcm.index=28)) totalBooking,

# hits customDimension Example
SUM((SELECT COUNT(hcd.value) FROM UNNEST(hits) h,UNNEST(h.customDimensions) hcd WHERE hcd.index=20)) h_cd1,

# hits products customDimension Example
SUM((SELECT COUNT(hpc.value) FROM UNNEST(hits) h,UNNEST(h.product) hp,UNNEST(hp.customDimensions) hpc WHERE hpc.index=1)) h_pc1,
SUM((SELECT COUNT(hpc.value) FROM UNNEST(hits) h,UNNEST(h.product) hp,UNNEST(hp.customDimensions) hpc WHERE hpc.index=2)) h_pc2,
SUM((SELECT COUNT(hpc.value) FROM UNNEST(hits) h,UNNEST(h.product) hp,UNNEST(hp.customDimensions) hpc WHERE hpc.index=3)) h_pc3,
SUM((SELECT COUNT(hpc.value) FROM UNNEST(hits) h,UNNEST(h.product) hp,UNNEST(hp.customDimensions) hpc WHERE hpc.index=4)) h_pc4,
SUM((SELECT COUNT(hpc.value) FROM UNNEST(hits) h,UNNEST(h.product) hp,UNNEST(hp.customDimensions) hpc WHERE hpc.index=5)) h_pc5,
SUM((SELECT COUNT(hpc.value) FROM UNNEST(hits) h,UNNEST(h.product) hp,UNNEST(hp.customDimensions) hpc WHERE hpc.index=6)) h_pc6,
SUM((SELECT COUNT(hpc.value) FROM UNNEST(hits) h,UNNEST(h.product) hp,UNNEST(hp.customDimensions) hpc WHERE hpc.index=7)) h_pc7,
SUM((SELECT COUNT(hpc.value) FROM UNNEST(hits) h,UNNEST(h.product) hp,UNNEST(hp.customDimensions) hpc WHERE hpc.index=8)) h_pc8,
SUM((SELECT COUNT(hpc.value) FROM UNNEST(hits) h,UNNEST(h.product) hp,UNNEST(hp.customDimensions) hpc WHERE hpc.index=9)) h_pc9,

FROM (
  SELECT
  *
  FROM
  `abcdefgh.12345678.ga_sessions_*` desktopProperty
  WHERE
  _TABLE_SUFFIX BETWEEN '20180122' AND '20180122'
  UNION ALL
  SELECT
  *
  FROM
  `abcdefgh.12345678.ga_sessions_*` mobileProperty
  WHERE
  _TABLE_SUFFIX BETWEEN '20180122' AND '20180122'
  ) table
GROUP BY
  date,
  fullVisitorId,
  visitId,
  visitNumber,
  visitStartTime,
  totals.visits,
  device.deviceCategory,
  totals.hits,
  totals.newVisits,
  totals.pageviews,
  totals.timeOnSite,
  trafficSource.adContent,
  trafficSource.campaign,
  trafficSource.keyword,
  trafficSource.medium,
  trafficSource.referralPath,
  trafficSource.source,
  channelGrouping,
  device.browser,
  device.browserSize,
  device.browserVersion,
  device.mobileDeviceInfo,
  device.mobileDeviceModel,
  device.operatingSystem,
  device.mobileDeviceBranding,
  geoNetwork.country,
  geoNetwork.city,
  userId
ORDER BY
date

1 answer:

Answer 0 (score: 1)

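If you stay at session level, you don't need GROUP BY, because the source table is already at session scope (one row per session). So you can get rid of it. You then also have to remove the aggregate functions wrapped around the subselects. Just bring everything into the scope you need.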

Example of bringing the different scopes up to session level:

SELECT
  date AS sessionScope,
  (SELECT count(hitNumber) FROM t.hits) hitScope,
  (SELECT SUM(IF(cd.value='example',1,0)) FROM t.hits AS h, h.customDimensions cd WHERE cd.index=1) hitCdScope,
  (SELECT COUNT(p.productSku) FROM t.hits AS h, h.product AS p WHERE h.ecommerceaction.action_type='6') productScope
FROM
  `project.dataset.ga_sessions_20180127` AS t
WHERE
  totals.transactions > 0
LIMIT
  1000
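
Applied to the fields from the question, the same idea might look like the sketch below. It is not taken from the answer: the `sessions` CTE is a hypothetical inline sample that only mimics a small slice of the GA export schema, so the query can be tried without access to a `ga_sessions_*` table, and the sample values are made up. The indexes (20, 28, 1) and the column aliases are the ones used in the question. Each subselect aggregates inside its own scope, so there is no outer SUM and no GROUP BY.

```sql
-- Hypothetical sample data: one session with two hits, shaped like the
-- nested GA export (customDimensions, hits.customDimensions,
-- hits.customMetrics, hits.product.customDimensions).
WITH sessions AS (
  SELECT
    '20180122' AS date,
    '1001' AS fullVisitorId,
    [STRUCT(20 AS index, 'user42' AS value)] AS customDimensions,
    [
      STRUCT(
        1 AS hitNumber,
        [STRUCT(20 AS index, 'a' AS value)] AS customDimensions,
        [STRUCT(28 AS index, 3 AS value)] AS customMetrics,
        [STRUCT('sku1' AS productSku,
                [STRUCT(1 AS index, 'red' AS value)] AS customDimensions)] AS product
      ),
      STRUCT(
        2 AS hitNumber,
        [STRUCT(20 AS index, 'b' AS value)] AS customDimensions,
        [STRUCT(28 AS index, 4 AS value)] AS customMetrics,
        [STRUCT('sku2' AS productSku,
                [STRUCT(1 AS index, 'blue' AS value)] AS customDimensions)] AS product
      )
    ] AS hits
)
SELECT
  -- session scope: plain columns, one row per session already
  date,
  fullVisitorId,
  -- session/user customDimension: scalar subquery, assumes at most one match
  (SELECT cd.value
   FROM UNNEST(customDimensions) cd
   WHERE cd.index = 20 AND REGEXP_CONTAINS(cd.value, r'\d')) AS userId,
  -- hits customMetrics: the SUM lives inside the subquery
  (SELECT SUM(hcm.value)
   FROM UNNEST(hits) h, UNNEST(h.customMetrics) hcm
   WHERE hcm.index = 28) AS totalBooking,
  -- hits customDimensions
  (SELECT COUNT(hcd.value)
   FROM UNNEST(hits) h, UNNEST(h.customDimensions) hcd
   WHERE hcd.index = 20) AS h_cd1,
  -- hits product customDimensions
  (SELECT COUNT(hpc.value)
   FROM UNNEST(hits) h, UNNEST(h.product) hp, UNNEST(hp.customDimensions) hpc
   WHERE hpc.index = 1) AS h_pc1
FROM
  sessions
ORDER BY
  date
```

Replacing the `sessions` CTE with the UNION ALL subquery from the question should work the same way, since every row coming out of the union is still a single session. Additional product customDimension indexes would each get their own subselect of the `h_pc1` form.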