I'm running the following query on BigQuery:
WITH allTables AS (
  SELECT
    CONCAT(user_dim.app_info.app_id, ':', user_dim.app_info.app_platform) AS app,
    user_dim.app_info.app_instance_id AS users
  FROM `dataset1.app_events_*`, UNNEST(event_dim) AS event
  WHERE _TABLE_SUFFIX BETWEEN '20170406' AND '20170406'
    OR _TABLE_SUFFIX BETWEEN 'intraday_20170406' AND 'intraday_20170406'
  UNION ALL
  SELECT
    CONCAT(user_dim.app_info.app_id, ':', user_dim.app_info.app_platform) AS app,
    user_dim.app_info.app_instance_id AS users
  FROM `dataset2.app_events_*`, UNNEST(event_dim) AS event
  WHERE _TABLE_SUFFIX BETWEEN '20170406' AND '20170406'
    OR _TABLE_SUFFIX BETWEEN 'intraday_20170406' AND 'intraday_20170406'
)
SELECT
  app AS target,
  COUNT(DISTINCT(users)) AS datapoint_value,
  UNIX_SECONDS(PARSE_TIMESTAMP('%Y%m%d', '20170406')) AS datapoint_time
FROM allTables
GROUP BY app
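The final query will be much larger than this, but this is a simple example. The problem I'm having is that if the WHERE condition isn't met, nothing is returned. I'd like to change this query so that when the WHERE condition isn't satisfied, it returns different data instead. Is there a way to do this in BigQuery? Any help would be great, thanks!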
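Answer 0: (score: 0)
Not sure, but moving the WHERE condition into CASE expressions for app and users might help. Insert the date manually as a timestamp column and parse it, so that you get data back from each subquery.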
WITH allTables AS (
  -- No WHERE clause here: every table matching the wildcard is read,
  -- and rows outside the target date get NULL for app and users.
  SELECT
    CASE WHEN _TABLE_SUFFIX BETWEEN '20170406' AND '20170406'
           OR _TABLE_SUFFIX BETWEEN 'intraday_20170406' AND 'intraday_20170406'
         THEN CONCAT(user_dim.app_info.app_id, ':', user_dim.app_info.app_platform)
         ELSE NULL END AS app,
    CASE WHEN _TABLE_SUFFIX BETWEEN '20170406' AND '20170406'
           OR _TABLE_SUFFIX BETWEEN 'intraday_20170406' AND 'intraday_20170406'
         THEN user_dim.app_info.app_instance_id
         ELSE NULL END AS users,
    '20170406' AS timestamp
  FROM `dataset1.app_events_*`, UNNEST(event_dim) AS event
  UNION ALL
  SELECT
    CASE WHEN _TABLE_SUFFIX BETWEEN '20170406' AND '20170406'
           OR _TABLE_SUFFIX BETWEEN 'intraday_20170406' AND 'intraday_20170406'
         THEN CONCAT(user_dim.app_info.app_id, ':', user_dim.app_info.app_platform)
         ELSE NULL END AS app,
    CASE WHEN _TABLE_SUFFIX BETWEEN '20170406' AND '20170406'
           OR _TABLE_SUFFIX BETWEEN 'intraday_20170406' AND 'intraday_20170406'
         THEN user_dim.app_info.app_instance_id
         ELSE NULL END AS users,
    '20170406' AS timestamp
  FROM `dataset2.app_events_*`, UNNEST(event_dim) AS event
)
SELECT
  app AS target,
  -- COUNT(DISTINCT ...) ignores NULLs, so the NULL app group reports 0.
  COUNT(DISTINCT(users)) AS datapoint_value,
  UNIX_SECONDS(PARSE_TIMESTAMP('%Y%m%d', timestamp)) AS datapoint_time
FROM allTables
GROUP BY app
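Another option, if scanning every table is a concern, is to keep the WHERE filter and append a fallback row that only appears when the filtered result is empty. The following is a minimal sketch of that idea, not a tested solution: the placeholder values ('no_data' and 0) and the CTE names `filtered` and `aggregated` are assumptions for illustration, only dataset1 is shown (dataset2 would be added with UNION ALL as in the question), and the unused UNNEST(event_dim) is left out because it does not affect a distinct count.

```sql
WITH filtered AS (
  SELECT
    CONCAT(user_dim.app_info.app_id, ':', user_dim.app_info.app_platform) AS app,
    user_dim.app_info.app_instance_id AS users
  FROM `dataset1.app_events_*`
  -- Same filter as the question: BETWEEN 'x' AND 'x' matches only 'x'.
  WHERE _TABLE_SUFFIX IN ('20170406', 'intraday_20170406')
),
aggregated AS (
  SELECT
    app AS target,
    COUNT(DISTINCT users) AS datapoint_value
  FROM filtered
  GROUP BY app
)
SELECT
  target,
  datapoint_value,
  UNIX_SECONDS(PARSE_TIMESTAMP('%Y%m%d', '20170406')) AS datapoint_time
FROM aggregated
UNION ALL
-- Fallback row: emitted only when `aggregated` has no rows.
SELECT
  'no_data' AS target,        -- hypothetical placeholder value
  0 AS datapoint_value,       -- hypothetical placeholder value
  UNIX_SECONDS(PARSE_TIMESTAMP('%Y%m%d', '20170406')) AS datapoint_time
FROM (SELECT 1 AS placeholder)
WHERE NOT EXISTS (SELECT 1 FROM aggregated)
```

The `FROM (SELECT 1 AS placeholder)` is there because BigQuery does not accept a WHERE clause on a SELECT without a FROM clause. Note that `aggregated` is referenced twice, so the underlying tables may be read twice, and that the query will still fail if no table at all matches the wildcard.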
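Answer 1: (score: 0)
I ended up not needing to do this for my project. Instead, I just run an insert into the tables every day and then update those tables.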
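For illustration, the daily insert could look roughly like the sketch below. The destination table `reporting.daily_app_users` and its column layout are hypothetical (the answer does not name them), and only dataset1 is shown.

```sql
-- Hypothetical destination table; create it beforehand with matching columns.
INSERT INTO `reporting.daily_app_users` (target, datapoint_value, datapoint_time)
SELECT
  CONCAT(user_dim.app_info.app_id, ':', user_dim.app_info.app_platform) AS target,
  COUNT(DISTINCT user_dim.app_info.app_instance_id) AS datapoint_value,
  UNIX_SECONDS(PARSE_TIMESTAMP('%Y%m%d', '20170406')) AS datapoint_time
FROM `dataset1.app_events_*`
WHERE _TABLE_SUFFIX IN ('20170406', 'intraday_20170406')
GROUP BY target
```

With this layout, a day with no matching events simply adds no rows, and a downstream query can decide how to treat the missing date.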