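I am using BigQuery standard SQL to query Google Analytics data, but when I use UNNEST to flatten the repeated record field in the table, other columns such as the hit count get duplicated and show values larger than the real ones.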
SELECT
date,
trafficSource.source as source,
trafficSource.medium as medium,
SUM(totals.hits) AS total_hit,
MAX(hits.transaction.transactionid) as transaction
FROM
`test.test.session_streaming_*`,unnest(hits) hits
WHERE
_table_suffix BETWEEN '20180401'
AND '20180501'
GROUP BY
date,
trafficSource.source,
trafficSource.medium
Can anyone tell me how to remove the duplicated data in this query?
Answer 0 (score: 1)
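You want to compute the maximum transaction ID within each row's hits, and then take the maximum over all rows. Doing this in a subquery avoids joining the table with UNNEST(hits), which repeats each session row once per hit and so inflates SUM(totals.hits). This should work: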
SELECT
date,
trafficSource.source as source,
trafficSource.medium as medium,
SUM(totals.hits) AS total_hit,
MAX((SELECT MAX(transaction.transactionid) FROM UNNEST(hits))) as transaction
FROM
`test.test.session_streaming_*`
WHERE
_table_suffix BETWEEN '20180401'
AND '20180501'
GROUP BY
date,
trafficSource.source,
trafficSource.medium
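To see where the inflated numbers come from, here is a small self-contained sketch you could paste into the BigQuery console. The `sessions` CTE and its values are made up for illustration (two sessions with 3 and 2 hits); only the column shapes follow the query above, and the real export table has many more fields.

```sql
-- Hypothetical inline data standing in for the GA export table.
WITH sessions AS (
  SELECT
    '20180401' AS date,
    STRUCT(3 AS hits) AS totals,
    [STRUCT(STRUCT('T1' AS transactionid) AS transaction),
     STRUCT(STRUCT(CAST(NULL AS STRING) AS transactionid) AS transaction),
     STRUCT(STRUCT('T2' AS transactionid) AS transaction)] AS hits
  UNION ALL
  SELECT
    '20180401',
    STRUCT(2 AS hits),
    [STRUCT(STRUCT(CAST(NULL AS STRING) AS transactionid) AS transaction),
     STRUCT(STRUCT('T3' AS transactionid) AS transaction)]
)
SELECT
  -- Joining with UNNEST(hits) yields one row per hit, so totals.hits
  -- is counted once per hit: 3 + 3 + 3 + 2 + 2 = 13.
  (SELECT SUM(totals.hits) FROM sessions, UNNEST(hits)) AS total_hit_after_unnest,
  -- Without the join there is one row per session: 3 + 2 = 5.
  (SELECT SUM(totals.hits) FROM sessions) AS total_hit_per_session,
  -- The per-row subquery reads the array without multiplying rows.
  (SELECT MAX((SELECT MAX(transaction.transactionid) FROM UNNEST(hits)))
   FROM sessions) AS max_transaction
```

The first column reproduces the problem from the question, the second is the value you actually want, and the third shows that the transaction ID is still reachable without flattening the table.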