Comparing one week with the following week across three years of data using SQL

Time: 2019-03-06 17:36:03

Tags: sql database google-bigquery

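I am trying to calculate the difference in store visitors between one week and the next. The query I am using only computes the difference between weeks within the same year; it does not compare the last week of one year with the first week of the next (for example, week 53 of 2016 and week 1 of 2017).

This is what my table looks like: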


| Date | Year | Week | Store_Name | Number_Of_Vistors |


Can anyone help with a query that works across all three years?

This is how I wrote the query:

SELECT
    (base.Store_Visitors-lw.Store_Visitors)/lw.Store_Visitors AS VARIANCE
FROM
  `myproject` base
JOIN (
  SELECT
    *, extract(WEEK FROM (DATE_ADD(DATE(TIMESTAMP(date)) , INTERVAL 1 Week))) AS n_week

  FROM
    `myproject` ) lw
ON
  base.WEEK = (lw.n_week-1)
  AND base.YEAR = lw.YEAR
  AND base.DAYOFWEEK = lw.DAYOFWEEK

  AND base.Store_Name = lw.Store_Name

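2 Answers:

Answer 0 (score: 1)

You need to row-number your data ordered by year and week, and then join on that row number or on some other non-repeating value.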
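
A minimal sketch of that row-numbering idea, not the answerer's original code. It assumes one row per store per week, the column names from the question's table header, and the question's `myproject` table placeholder; `numbered` and `week_seq` are illustrative names. Because the sequence is ordered by year and then week, the first week of a year directly follows the last week of the previous year.

```sql
#standardSQL
-- Sketch only: assumes one row per store per week and the column names
-- shown in the question (year, week, Store_Name, Number_Of_Vistors).
WITH numbered AS (
  SELECT
    Store_Name,
    year,
    week,
    Number_Of_Vistors,
    -- The sequence keeps counting across years, so week 1 of a year
    -- comes right after the last week of the previous year.
    ROW_NUMBER() OVER (PARTITION BY Store_Name ORDER BY year, week) AS week_seq
  FROM `myproject`
)
SELECT
  base.Store_Name,
  base.year,
  base.week,
  -- SAFE_DIVIDE returns NULL instead of failing when last week had 0 visitors
  SAFE_DIVIDE(base.Number_Of_Vistors - lw.Number_Of_Vistors,
              lw.Number_Of_Vistors) AS variance
FROM numbered AS base
JOIN numbered AS lw
  ON base.Store_Name = lw.Store_Name
 AND base.week_seq = lw.week_seq + 1
ORDER BY base.Store_Name, base.year, base.week
```

If a store has no row for some week, consecutive row numbers will pair weeks that are not actually adjacent, so this only holds when the weekly data has no gaps.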


Answer 1 (score: 1)

Below is for BigQuery Standard SQL and uses an analytic function instead of a self-join:

#standardSQL
WITH temp AS (
  SELECT 
    EXTRACT(YEAR FROM t.date) year, 
    EXTRACT(WEEK FROM t.date) week, 
    Store_Name, 
    Number_Of_Vistors
  FROM `project.dataset.table` t
)
SELECT Store_Name, year, week, 
  (Number_Of_Vistors - ANY_VALUE(Number_Of_Vistors) 
    OVER(PARTITION BY Store_Name, year ORDER BY week RANGE BETWEEN 1 PRECEDING AND 1 PRECEDING)
  ) / Number_Of_Vistors AS variance
FROM temp t   

You can test and play with the above using dummy data, as in the example below:

#standardSQL
WITH `project.dataset.table` AS (
  SELECT DATE '2018-12-02' `date`, 'abc' Store_Name, 11 Number_Of_Vistors UNION ALL
  SELECT '2018-12-09', 'abc', 22 UNION ALL
  SELECT '2018-12-16', 'abc', 33 UNION ALL
  SELECT '2018-12-23', 'abc', 44 UNION ALL
  SELECT '2018-12-30', 'abc', 55 UNION ALL
  SELECT '2019-01-06', 'abc', 66 UNION ALL
  SELECT '2019-01-13', 'abc', 77 UNION ALL
  SELECT '2019-01-20', 'abc', 88 
), temp AS (
  SELECT 
    EXTRACT(YEAR FROM t.date) year, 
    EXTRACT(WEEK FROM t.date) week, 
    Store_Name, 
    Number_Of_Vistors
  FROM `project.dataset.table` t
)
SELECT Store_Name, year, week, 
  (Number_Of_Vistors - ANY_VALUE(Number_Of_Vistors) 
    OVER(PARTITION BY Store_Name, year ORDER BY week RANGE BETWEEN 1 PRECEDING AND 1 PRECEDING)
  ) / Number_Of_Vistors AS variance
FROM temp t
ORDER BY Store_Name, year, week   

with the result:

Row Store_Name  year    week    variance     
1   abc         2018    48      null     
2   abc         2018    49      0.5  
3   abc         2018    50      0.3333333333333333   
4   abc         2018    51      0.25     
5   abc         2018    52      0.2  
6   abc         2019    1       null     
7   abc         2019    2       0.14285714285714285  
8   abc         2019    3       0.125     

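Note: since it is not clear from your question exactly how your data is represented, I assumed you have one row per store per week per year.

You should be able to adjust the above to your actual data types and use case.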
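
Because the window above is partitioned by year, the first week of each year has no preceding row in its partition, which is why 2019 week 1 comes back as null. One possible variation, sketched here under assumptions rather than taken from the answer, is to drop the year from the partition and order by the week's start date instead. It assumes `date` is a DATE column; `weekly`, `with_prev`, `week_start`, `visitors` and `prev_visitors` are illustrative names. It also divides by the previous week's value, as in the question's formula, rather than by the current week's.

```sql
#standardSQL
-- Sketch only: assumes `date` is a DATE column; the table name is the same
-- placeholder used in the answer above.
WITH weekly AS (
  SELECT
    Store_Name,
    DATE_TRUNC(t.date, WEEK) AS week_start,  -- the Sunday that starts the week
    SUM(Number_Of_Vistors) AS visitors
  FROM `project.dataset.table` t
  GROUP BY Store_Name, week_start
), with_prev AS (
  SELECT
    Store_Name,
    week_start,
    visitors,
    -- Value from exactly 7 days earlier; NULL if that week has no row
    ANY_VALUE(visitors) OVER (
      PARTITION BY Store_Name
      ORDER BY UNIX_DATE(week_start)
      RANGE BETWEEN 7 PRECEDING AND 7 PRECEDING
    ) AS prev_visitors
  FROM weekly
)
SELECT
  Store_Name,
  week_start,
  SAFE_DIVIDE(visitors - prev_visitors, prev_visitors) AS variance
FROM with_prev
ORDER BY Store_Name, week_start
```

Ordering by a day count means the year boundary needs no special handling: with the dummy data above, the row for 2019-01-06 is compared with 2018-12-30, which should give (66 - 55) / 55 = 0.2 instead of null.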