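To query different events in BigQuery, I want to (first) determine previous dates dynamically and (second) substitute them into my script. This would save me from manually updating these dates every week whenever I pull this query (which is an important factor for me).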
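My first priority was to determine the previous weeks dynamically. I take the week start and end dates from the ISO week date standard (ISO-8601), in which weeks run from Monday to Sunday.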
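At the time of writing (31 October 2017), the two queries below return 2017-10-23 (last Monday) and 2017-10-29 (most recent Sunday). One can then change one of the values to get the dates for the two weeks before that. So far, so good!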
Last Monday:
# Legacy SQL
select date(date_add(current_date(), if(dayofweek(current_date()) = 1, -6, -(dayofweek(current_date()) + 5)), "DAY"))
Most recent Sunday:
# Legacy SQL
select date(date_add(current_date(), if(dayofweek(current_date()) = 1, -6, -(dayofweek(current_date()) - 1)), "DAY"))
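Because the main query is written in Standard SQL, the same two dates may be easier to reuse if they are also computed in Standard SQL. The following is only a sketch and not part of the original post. It assumes `DATE_TRUNC` with the `ISOWEEK` date part, which truncates a date to the Monday of its ISO week, and the column aliases are made up for illustration. When run on a Sunday it treats the week ending that day as still in progress and returns the week before, which may differ from the Legacy SQL queries above.

```sql
#standardSQL
-- Sketch: last completed ISO week (Monday..Sunday) relative to today.
-- DATE_TRUNC(..., ISOWEEK) gives the Monday of the current ISO week.
SELECT
  -- Monday of the previous ISO week
  DATE_SUB(DATE_TRUNC(CURRENT_DATE(), ISOWEEK), INTERVAL 7 DAY) AS last_monday,
  -- Sunday that closes the previous ISO week
  DATE_SUB(DATE_TRUNC(CURRENT_DATE(), ISOWEEK), INTERVAL 1 DAY) AS most_recent_sunday,
  -- Same Sunday formatted as YYYYMMDD, the shape _TABLE_SUFFIX is compared against
  FORMAT_DATE('%Y%m%d',
    DATE_SUB(DATE_TRUNC(CURRENT_DATE(), ISOWEEK), INTERVAL 1 DAY)) AS most_recent_sunday_suffix;
```

For a run on 31 October 2017 (a Tuesday), the current ISO week starts on 2017-10-30, so these expressions would give 2017-10-23 and 2017-10-29, matching the values quoted above.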
The problem is this query, where the dates are still hard-coded:
# Standard SQL
SELECT event.name AS event_name,
COUNT(DISTINCT CASE WHEN _TABLE_SUFFIX >= '20171009' AND _TABLE_SUFFIX < '20171016' THEN user_dim.app_info.app_instance_id END) AS Week_1,
COUNT(DISTINCT CASE WHEN _TABLE_SUFFIX >= '20171016' AND _TABLE_SUFFIX < '20171023' THEN user_dim.app_info.app_instance_id END) AS Week_2,
COUNT(DISTINCT CASE WHEN _TABLE_SUFFIX >= '20171023' AND _TABLE_SUFFIX < '20171030' THEN user_dim.app_info.app_instance_id END) AS Week_3
-- Remember that Week 43 ends with 29 October, but we use < 30 October.
FROM `<<project-id>>.app_events_*`, UNNEST(event_dim) AS event
WHERE _TABLE_SUFFIX >= '20171009' AND _TABLE_SUFFIX < '20171030'
GROUP BY event_name
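Answer:
After playing around with some of Mikhail's earlier suggestions, I arrived at the following code. It runs very fast when simply counting events, but takes up to 40 seconds when counting distinct values. I still need to optimize it somehow, but I thought I would post what I have so far as an answer.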
SELECT event.name as event_name,
COUNT(DISTINCT CASE WHEN _TABLE_SUFFIX
BETWEEN FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 3 * 7 + EXTRACT(DAYOFWEEK FROM CURRENT_DATE()) - 2 DAY))
AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 2 * 7 + EXTRACT(DAYOFWEEK FROM CURRENT_DATE()) - 1 DAY))
THEN user_dim.app_info.app_instance_id END) AS Minus_3_Weeks,
COUNT(DISTINCT CASE WHEN _TABLE_SUFFIX
BETWEEN FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 2 * 7 + EXTRACT(DAYOFWEEK FROM CURRENT_DATE()) - 2 DAY))
AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 * 7 + EXTRACT(DAYOFWEEK FROM CURRENT_DATE()) - 1 DAY))
THEN user_dim.app_info.app_instance_id END) AS Minus_2_Weeks,
COUNT(DISTINCT CASE WHEN _TABLE_SUFFIX
BETWEEN FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 * 7 + EXTRACT(DAYOFWEEK FROM CURRENT_DATE()) - 2 DAY))
AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL EXTRACT(DAYOFWEEK FROM CURRENT_DATE()) - 1 DAY))
THEN user_dim.app_info.app_instance_id END) AS Minus_1_Week
FROM `<<project-id>>.app_events_*`, UNNEST(event_dim) AS event
WHERE _TABLE_SUFFIX
BETWEEN FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 3 * 7 + EXTRACT(DAYOFWEEK FROM CURRENT_DATE()) - 2 DAY))
AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL EXTRACT(DAYOFWEEK FROM CURRENT_DATE()) - 1 DAY))
GROUP BY event_name;