BigQuery Standard SQL duplicates data when using UNNEST

Time: 2018-05-21 13:38:49

Tags: google-bigquery

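I am using BigQuery Standard SQL to run calculations on Google Analytics data. When I use UNNEST to flatten the repeated record field hits, other columns such as the hit count are duplicated, so the totals come out higher than the actual values.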

SELECT
    date,
    trafficSource.source as source,
    trafficSource.medium as medium,
    SUM(totals.hits) AS total_hit,
    MAX(hits.transaction.transactionid) as transaction
FROM
    `test.test.session_streaming_*`, UNNEST(hits) AS hits
WHERE
    _table_suffix BETWEEN '20180401'
    AND '20180501'   
GROUP BY
    date,
    trafficSource.source,
    trafficSource.medium

Can someone tell me how to remove the duplicated data in this query?

1 answer:

Answer 0 (score: 1)

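Joining the table with UNNEST(hits) produces one row per hit, so the session-level totals.hits value is repeated for every hit in the session and SUM(totals.hits) overcounts. Instead, compute the maximum transaction ID within hits for each row using a scalar subquery, and then take the maximum across all rows. This should work: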

SELECT
  date,
  trafficSource.source as source,
  trafficSource.medium as medium,
  SUM(totals.hits) AS total_hit,
  MAX((SELECT MAX(transaction.transactionid) FROM UNNEST(hits))) as transaction
FROM
  `test.test.session_streaming_*`
WHERE
  _table_suffix BETWEEN '20180401'
    AND '20180501'
GROUP BY
  date,
  trafficSource.source,
  trafficSource.medium
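
To see the difference on a small scale, here is a minimal sketch that runs both variants against inline data. The sessions CTE, its total_hits column, and the transaction IDs T1 to T3 are made up for illustration and only stand in for the real session table.

```sql
-- Hypothetical inline data: two sessions, each with a session-level
-- hit count (total_hits) and a repeated field (hits).
WITH sessions AS (
  SELECT
    3 AS total_hits,
    [STRUCT('T1' AS transactionid),
     STRUCT('T2' AS transactionid),
     STRUCT(CAST(NULL AS STRING) AS transactionid)] AS hits
  UNION ALL
  SELECT
    2 AS total_hits,
    [STRUCT('T3' AS transactionid),
     STRUCT(CAST(NULL AS STRING) AS transactionid)] AS hits
)
-- Variant 1: joining with UNNEST repeats each session row once per hit,
-- so total_hits is summed 3 times for the first session and 2 times for the second.
SELECT
  'joined with UNNEST' AS variant,
  SUM(total_hits) AS total_hit,
  MAX(h.transactionid) AS max_transaction
FROM sessions, UNNEST(hits) AS h
UNION ALL
-- Variant 2: the scalar subquery reads hits without multiplying the session rows.
SELECT
  'scalar subquery' AS variant,
  SUM(total_hits) AS total_hit,
  MAX((SELECT MAX(h.transactionid) FROM UNNEST(hits) AS h)) AS max_transaction
FROM sessions
```

With this data the first variant returns a total_hit of 13 (3 × 3 + 2 × 2), while the second returns 5 (3 + 2). Both return T3 as the maximum transaction ID, since MAX is not affected by repeated rows.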