Duplicating records to fill gaps between dates in Google BigQuery

Time: 2016-08-13 04:15:00

Tags: google-bigquery

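I found a similar resource that covers how to do this in SQL: Duplicating records to fill gap between dates

I understand that BigQuery may not be the best place to do this, so I am trying to find out whether it is possible at all. When I tried some of the approaches from the link above, I ran into problems because BigQuery does not support some of the functions they use.

Given a table with data structured like this: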

    MODIFY_DATE             SKU         STORE   STOCK_ON_HAND
    08/01/2016 00:00:00     1120010     21      100
    08/05/2016 00:00:00     1120010     21      75
    08/07/2016 00:00:00     1120010     21      40

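How can I build a query in Google BigQuery that produces the output below, where the value for a given date is repeated for each following date until the next change?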

    MODIFY_DATE             SKU         STORE   STOCK_ON_HAND
    08/01/2016 00:00:00     1120010     21      100
    08/02/2016 00:00:00     1120010     21      100
    08/03/2016 00:00:00     1120010     21      100
    08/04/2016 00:00:00     1120010     21      100
    08/05/2016 00:00:00     1120010     21      75
    08/06/2016 00:00:00     1120010     21      75
    08/07/2016 00:00:00     1120010     21      40

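I know I need to generate a table containing every date in the given range, but I am having trouble working out whether that can be done. Any ideas?

1 answer:

Answer 0: (score: 1)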

  

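> How can I build a query in Google BigQuery that produces the output below, where the value for a given date is repeated for the dates in between until the next change?

See the example below.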

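    -- BigQuery legacy SQL. The inline SELECTs in TABLE_WITH_GAPS stand in for
    -- the real table; their values are illustrative.
    SELECT
      MODIFY_DATE,
      -- Each grp holds at most one real row, so MAX copies its values onto the gap days.
      MAX(SKU_TEMP) OVER(PARTITION BY grp) AS SKU,
      MAX(STORE_TEMP) OVER(PARTITION BY grp) AS STORE,
      MAX(STOCK_ON_HAND_TEMP) OVER(PARTITION BY grp) AS STOCK_ON_HAND
    FROM (
      SELECT
        DAY AS MODIFY_DATE, SKU AS SKU_TEMP, STORE AS STORE_TEMP, STOCK_ON_HAND AS STOCK_ON_HAND_TEMP,
        -- Running count of non-NULL SKUs: it increases by one at every real row,
        -- so a real row and the gap days that follow it share the same grp.
        COUNT(SKU) OVER(ORDER BY DAY ASC ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS grp
      FROM (
        -- Calendar: one row per day from 2016-08-01 through 2016-08-07.
        SELECT DATE(DATE_ADD(TIMESTAMP("2016-08-01"), pos - 1, "DAY")) AS DAY
        FROM (
          SELECT ROW_NUMBER() OVER() AS pos, *
          FROM (FLATTEN((
            -- RPAD builds a string of N dots; SPLIT and FLATTEN turn it into N rows.
            SELECT SPLIT(RPAD('', 1 + DATEDIFF(TIMESTAMP("2016-08-07"), TIMESTAMP("2016-08-01")), '.'),'') AS h
            FROM (SELECT NULL)),h
          )))
      ) AS DATES
      -- Days without a matching record get NULLs here and are filled in above.
      LEFT JOIN (
        SELECT DATE(MODIFY_DATE) AS MODIFY_DATE, SKU, STORE, STOCK_ON_HAND
        FROM
          (SELECT "2016-08-01" AS MODIFY_DATE, "1120010" AS SKU, 21 AS STORE, 75 AS STOCK_ON_HAND),
          (SELECT "2016-08-05" AS MODIFY_DATE, "1120010" AS SKU, 22 AS STORE, 100 AS STOCK_ON_HAND),
          (SELECT "2016-08-07" AS MODIFY_DATE, "1120011" AS SKU, 23 AS STORE, 40 AS STOCK_ON_HAND)
      ) AS TABLE_WITH_GAPS
      ON TABLE_WITH_GAPS.MODIFY_DATE = DATES.DAY
    )
    ORDER BY MODIFY_DATE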
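
For readers who want to check the logic outside BigQuery, here is a minimal Python sketch of what the query computes: build one entry per day in the range, then carry the most recent real record forward onto days that have none. It is only an illustration and not part of the original answer. The function name `fill_gaps` and the tuple layout are hypothetical, and the sample rows are the ones from the question, assuming a single SKU/store series with at most one record per day.

```python
from datetime import date, timedelta


def fill_gaps(rows, start, end):
    """Repeat each record for every following day until the next record.

    rows: list of (modify_date, sku, store, stock_on_hand) tuples,
          assumed to hold at most one record per date (hypothetical layout).
    start, end: inclusive bounds of the date range to generate.
    """
    by_day = {r[0]: r for r in rows}   # plays the role of the LEFT JOIN lookup
    filled = []
    last = None                        # most recent real record seen so far
    day = start
    while day <= end:                  # plays the role of the generated DATES table
        if day in by_day:
            last = by_day[day]
        if last is None:
            # Days before the first record have nothing to carry forward.
            filled.append((day, None, None, None))
        else:
            filled.append((day,) + last[1:])
        day += timedelta(days=1)
    return filled


# Sample rows taken from the question.
rows = [
    (date(2016, 8, 1), "1120010", 21, 100),
    (date(2016, 8, 5), "1120010", 21, 75),
    (date(2016, 8, 7), "1120010", 21, 40),
]

for row in fill_gaps(rows, date(2016, 8, 1), date(2016, 8, 7)):
    print(row)
```

With several SKUs or stores in the same table, the carry-forward would need to run separately for each SKU/store pair; the sketch above does not handle that case.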