Dynamic weekly date ranges in BigQuery for the last 3 weeks

Time: 2017-10-31 19:09:24

Tags: google-bigquery firebase-analytics

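To query different events in BigQuery, I want to (first) determine the dates of the previous weeks dynamically and (second) substitute them into my script. This would save me from manually updating these dates every week whenever I pull this query (which is an important factor for me).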

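My first priority is to determine the previous weeks dynamically. Start and end dates follow the ISO week date standard (ISO-8601), in which a week starts on Monday and ends on Sunday.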

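At the time of writing (31 October 2017), the two queries below return 2017-10-23 (last Monday) and 2017-10-29 (most recent Sunday). One of the values can then be changed to get the dates for the two weeks before that. So far, so good!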

  

Last Monday:

# Legacy SQL
select date(date_add(current_date(), if(dayofweek(current_date()) = 1, -6, -(dayofweek(current_date()) + 5)), "DAY"))
     

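Most recent Sunday:

# Legacy SQL
# DAYOFWEEK returns 1 for Sunday, so on a Sunday the offset is 0 (today); the original -6 would return a Monday.
select date(date_add(current_date(), if(dayofweek(current_date()) = 1, 0, -(dayofweek(current_date()) - 1)), "DAY"))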
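
For comparison, the same two dates can be computed in Standard SQL. This is only a sketch, assuming DATE_TRUNC with the ISOWEEK date part is available in your BigQuery project. The `params` CTE and its fixed `today` value are hypothetical and only there to make the result reproducible; swap in CURRENT_DATE() for real use. Unlike the Legacy SQL Monday formula above, which on a Sunday returns the Monday of the week ending that day, this version always returns the previous complete week.

```sql
# Standard SQL
WITH params AS (
  -- Hypothetical fixed reference date; replace with CURRENT_DATE() in practice.
  SELECT DATE '2017-10-31' AS today
)
SELECT
  -- Monday of the current ISO week, minus 7 days = Monday of the previous week.
  DATE_SUB(DATE_TRUNC(today, ISOWEEK), INTERVAL 7 DAY) AS last_monday,
  -- Monday of the current ISO week, minus 1 day = Sunday of the previous week.
  DATE_SUB(DATE_TRUNC(today, ISOWEEK), INTERVAL 1 DAY) AS most_recent_sunday
FROM params;
-- For 2017-10-31 this yields 2017-10-23 and 2017-10-29.
```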

The question is:

  • How can I migrate my SQL code (shown below) from Legacy to Standard SQL, and replace all the static dates below with the dynamic dates mentioned above?
# Standard SQL
SELECT
event.name as event_name,
COUNT(DISTINCT CASE WHEN _TABLE_SUFFIX >= '20171009' AND _TABLE_SUFFIX < '20171016' THEN user_dim.app_info.app_instance_id END) AS Week_1,
COUNT(DISTINCT CASE WHEN _TABLE_SUFFIX >= '20171016' AND _TABLE_SUFFIX < '20171023' THEN user_dim.app_info.app_instance_id END) AS Week_2,
COUNT(DISTINCT CASE WHEN _TABLE_SUFFIX >= '20171023' AND _TABLE_SUFFIX < '20171030' THEN user_dim.app_info.app_instance_id END) AS Week_3
-- Remember that Week 43 ends with 29 October, but we use < 30 October. 
FROM `<<project-id>>.app_events_*`, UNNEST(event_dim) AS event
WHERE  _TABLE_SUFFIX >= '20171009' AND _TABLE_SUFFIX < '20171030'
GROUP BY event_name

Answer:

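After playing around with some of Mikhail's earlier suggestions, I ended up with the following code. It runs very fast when counting events, but a distinct count can take up to 40 seconds. I still need to optimize it somehow, but I thought I would post this answer in the meantime.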

SELECT event.name as event_name, 
COUNT(DISTINCT CASE WHEN _TABLE_SUFFIX 
  BETWEEN FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 3 * 7 + EXTRACT(DAYOFWEEK FROM CURRENT_DATE()) - 2 DAY)) 
  AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 2 * 7 + EXTRACT(DAYOFWEEK FROM CURRENT_DATE()) - 1 DAY))
THEN user_dim.app_info.app_instance_id END) AS Minus_3_Weeks,
COUNT(DISTINCT CASE WHEN _TABLE_SUFFIX 
  BETWEEN FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 2 * 7 + EXTRACT(DAYOFWEEK FROM CURRENT_DATE()) - 2 DAY)) 
  AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 * 7 + EXTRACT(DAYOFWEEK FROM CURRENT_DATE()) - 1 DAY))
THEN user_dim.app_info.app_instance_id END) AS Minus_2_Weeks,  
COUNT(DISTINCT CASE WHEN _TABLE_SUFFIX 
  BETWEEN FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 * 7 + EXTRACT(DAYOFWEEK FROM CURRENT_DATE()) - 2 DAY)) 
  AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL EXTRACT(DAYOFWEEK FROM CURRENT_DATE()) - 1 DAY))
THEN user_dim.app_info.app_instance_id END) AS Minus_1_Week
FROM `<<project-id>>.app_events_*`, UNNEST(event_dim) AS event
WHERE _TABLE_SUFFIX 
  BETWEEN FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 3 * 7 + EXTRACT(DAYOFWEEK FROM CURRENT_DATE()) - 2 DAY)) 
  AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL EXTRACT(DAYOFWEEK FROM CURRENT_DATE()) - 1 DAY))
GROUP BY event_name;
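
To check which table suffixes the three windows cover, the week boundaries can be listed on their own. This is a minimal sketch, assuming DATE_TRUNC with the ISOWEEK date part is available; the `weeks` CTE and the column names `weeks_back`, `first_suffix`, and `last_suffix` are made up for illustration and are not part of the query above.

```sql
# Standard SQL
WITH weeks AS (
  SELECT
    n AS weeks_back,
    -- Monday of the ISO week that started n weeks before the current one.
    DATE_SUB(DATE_TRUNC(CURRENT_DATE(), ISOWEEK), INTERVAL n WEEK) AS monday
  FROM UNNEST([1, 2, 3]) AS n
)
SELECT
  weeks_back,
  -- Format both ends the same way as the _TABLE_SUFFIX values (YYYYMMDD).
  FORMAT_DATE('%Y%m%d', monday) AS first_suffix,
  FORMAT_DATE('%Y%m%d', DATE_ADD(monday, INTERVAL 6 DAY)) AS last_suffix
FROM weeks
ORDER BY weeks_back DESC;
```

Run on 2017-10-31, this should list 20171009 to 20171015, 20171016 to 20171022, and 20171023 to 20171029, the same ranges as the static dates in the question. On a Sunday it differs from the query above, which treats the week ending that day as the most recent one.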
