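Filling the blanks in a waterfall from another table

Time: 2014-11-21 01:31:29

Tags: google-bigquery qsqlquery waterfall

I have two tables:

1. Raw forecast data from the forecast table, pulled by snapshot date. I use this data to build a waterfall that looks like this: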

item/snapshot        forecast weeks
123    | 8/25/14 | 9/1/14 | 9/8/14 | 9/15/14
--------------------------------------------
8/24/14|  7661   | 4980   | 588    | 2232
8/31/14|         | 8319   | 1968   | 2760
9/7/14 |         |        | 6931   | 684
9/14/14|         |        |        | 9328

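The row labels are snapshot dates and the column labels are forecast weeks. There are many snapshot dates, and each snapshot provides forecast data starting from its own date, shown under the forecast weeks. The first snapshot date x forecasts only the weeks after x, and the second snapshot date y forecasts only the weeks after y.


2. Consumption data from the consumption table. I want to match the week in the consumption table to the week in the forecast table and insert the consumption to fill the blanks in the waterfall.

If I did this manually in Excel, I would take the forecast week of 8/25, which is week number 35, find week 35 in the consumption table, and insert it there. Week 35 is therefore the same for every snapshot date.

The result would look like this: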

item/snapshot        forecast weeks
123    | 8/25/14 | 9/1/14 | 9/8/14 | 9/15/14
--------------------------------------------
8/24/14|  7661   | 4980   | 588    | 2232
8/31/14|  2222   | 8319   | 1968   | 2760
9/7/14 |  2222   | 333    | 6931   | 684
9/14/14|  2222   | 333    | 444    | 9328
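
The manual Excel lookup described above can be sketched in a few lines of Python. This is only an illustration of the logic, not BigQuery code: the dictionary names are made up for the example, and it assumes the week numbers are ISO weeks (under which 8/25/14 is indeed week 35).

```python
from datetime import datetime

# Forecast quantities keyed by (snapshot date, forecast week start); values from the example.
forecast = {
    ("8/24/14", "8/25/14"): 7661, ("8/24/14", "9/1/14"): 4980,
    ("8/24/14", "9/8/14"): 588, ("8/24/14", "9/15/14"): 2232,
    ("8/31/14", "9/1/14"): 8319, ("8/31/14", "9/8/14"): 1968,
    ("8/31/14", "9/15/14"): 2760,
    ("9/7/14", "9/8/14"): 6931, ("9/7/14", "9/15/14"): 684,
    ("9/14/14", "9/15/14"): 9328,
}

# Consumption keyed by week number (assumption: ISO week numbering).
consumption_by_week = {35: 2222, 36: 333, 37: 444}


def iso_week(date_text):
    """Return the ISO week number of an m/d/yy date string."""
    return datetime.strptime(date_text, "%m/%d/%y").isocalendar()[1]


snapshots = ["8/24/14", "8/31/14", "9/7/14", "9/14/14"]
weeks = ["8/25/14", "9/1/14", "9/8/14", "9/15/14"]

# Keep the forecast where one exists; otherwise look up consumption by week number.
filled = {}
for snap in snapshots:
    for week in weeks:
        if (snap, week) in forecast:
            filled[(snap, week)] = forecast[(snap, week)]
        else:
            filled[(snap, week)] = consumption_by_week[iso_week(week)]

for snap in snapshots:
    print(snap, [filled[(snap, week)] for week in weeks])
```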

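The problem is in my forecast table: for example, the first snapshot date forecasts week 1 through week 10, but the second snapshot date only forecasts week 2 through week 10. I don't know how, or whether it is even possible, to automate this in BigQuery SQL, because a blank basically means there is no data for that forecast week.

If anyone can give me some ideas, I would really appreciate it.

Here is my script: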
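
To make the "blank means no row" point concrete, here is a small Python sketch (illustrative names, example data only) that lists the cells the waterfall needs but the forecast table does not contain. Because those rows do not exist, a plain join on the forecast table cannot produce them; the full grid of snapshot and week combinations has to be generated first.

```python
from itertools import product

# (snapshot, forecast week) pairs that have a row in the example forecast data.
existing = {
    ("8/24/14", "8/25/14"), ("8/24/14", "9/1/14"),
    ("8/24/14", "9/8/14"), ("8/24/14", "9/15/14"),
    ("8/31/14", "9/1/14"), ("8/31/14", "9/8/14"), ("8/31/14", "9/15/14"),
    ("9/7/14", "9/8/14"), ("9/7/14", "9/15/14"),
    ("9/14/14", "9/15/14"),
}

snapshots = ["8/24/14", "8/31/14", "9/7/14", "9/14/14"]
weeks = ["8/25/14", "9/1/14", "9/8/14", "9/15/14"]

# Every cell the waterfall should contain, whether or not a forecast row exists.
grid = list(product(snapshots, weeks))

# Cells with no forecast row: these are the blanks to fill from consumption.
missing = [cell for cell in grid if cell not in existing]

for snap, week in missing:
    print(f"snapshot {snap} has no forecast for week {week}")
```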

 //Get item info from forecast table
DEFINE INLINE TABLE t1
SELECT CONCAT(SUBSTR(snapshot_date, -4, 4),'-',SUBSTR(snapshot_date, -10, 2),'-', SUBSTR(snapshot_date, -7, 2)) snapshot, 
item_name, 
type, 
item_description, 
CONCAT(SUBSTR(forecast_week_start_date, -4, 4),'-',SUBSTR(forecast_week_start_date, -10, 2),'-', SUBSTR(forecast_week_start_date, -7, 2)) forecast_week_start_date, 
SUM(quantity) qty, 
forecast_week_number, 
forecast_year_number,
CONCAT(STRING(forecast_year_number),'-',STRING(forecast_week_number)) year_week
FROM forecast
WHERE 
concat(SUBSTR(snapshot_date, -4, 4),'-',SUBSTR(snapshot_date, -10, 2),'-', SUBSTR(snapshot_date, -7, 2)) >= 
strftime_usec(date_add(TIME_USEC_TO_WEEK(date_add(now(),-84 ,'DAY'),1),-1,'DAY'),'%Y-%m-%d')
GROUP BY snapshot, 
item_name, 
type, 
item_description, 
forecast_week_start_date, 
forecast_week_number, 
forecast_year_number,
year_week
ORDER BY forecast_week_start_date

//Get min year_week to use later
DEFINE INLINE TABLE t2
SELECT MIN(year_week) min_year_week
FROM t1


//Get consumption data and apply using dc deploy week
SELECT 
snapshot, 
item_name, 
type, 
item_description, 
forecast_week_start_date, 
qty,
forecast_week_number, 
forecast_year_number,
year_week,
IF(t2.min_year_week != year_week, qty + ABS(consumption_qty), qty) quantity

FROM t1
LEFT JOIN ALL 
 (SELECT item_name, week, SUM(transaction_quantity) consumption_qty
  FROM consumption 
  GROUP BY item_name,week) inv
ON t1.year_week=inv.week AND t1.item_name=inv.item_name
CROSS JOIN t2

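1 answer:

Answer 0 (score: 1)

I took a stab at this.

These two queries generate tables from the values in your example. Assume the output of the first query is written to consumption_table: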

SELECT *
FROM
  (SELECT 123 AS item, '8/25/14' AS date, 2222 AS quantity),
  (SELECT 123 AS item, '9/1/14' AS date, 333 AS quantity),
  (SELECT 123 AS item, '9/8/14' AS date, 444 AS quantity),
  (SELECT 123 AS item, '9/15/14' AS date, 0 AS quantity);

The output of the second query is written to forecast_table:

SELECT *
FROM
  (SELECT 123 AS item, '8/24/14' AS snapshot, '8/25/14' AS forecast, 7661 AS quantity),
  (SELECT 123 AS item, '8/24/14' AS snapshot, '9/1/14' AS forecast, 4980 AS quantity),
  (SELECT 123 AS item, '8/24/14' AS snapshot, '9/8/14' AS forecast, 588 AS quantity),
  (SELECT 123 AS item, '8/24/14' AS snapshot, '9/15/14' AS forecast, 2232 AS quantity),
  (SELECT 123 AS item, '8/31/14' AS snapshot, '9/1/14' AS forecast, 8319 AS quantity),
  (SELECT 123 AS item, '8/31/14' AS snapshot, '9/8/14' AS forecast, 1968 AS quantity),
  (SELECT 123 AS item, '8/31/14' AS snapshot, '9/15/14' AS forecast, 2760 AS quantity),
  (SELECT 123 AS item, '9/7/14' AS snapshot, '9/8/14' AS forecast, 6931 AS quantity),
  (SELECT 123 AS item, '9/7/14' AS snapshot, '9/15/14' AS forecast, 684 AS quantity),
  (SELECT 123 AS item, '9/14/14' AS snapshot, '9/15/14' AS forecast, 9328 AS quantity);

Then the following query produces what you want:

SELECT
    Consumed.item AS item,
    Consumed.snapshot AS snapshot,
    Consumed.date AS date,
    IF (Forecast.quantity IS NULL, Consumed.quantity, Forecast.quantity) AS quantity
FROM
    (SELECT
        C.item     AS item,
        S.snapshot AS snapshot,
        C.date     AS date,
        C.quantity AS quantity
     FROM
        (SELECT *
         FROM
            (SELECT '8/24/14' AS snapshot),
            (SELECT '8/31/14' AS snapshot),
            (SELECT '9/7/14' AS snapshot),
            (SELECT '9/14/14' AS snapshot)) AS S
     CROSS JOIN
        consumption_table AS C) AS Consumed
LEFT JOIN
    forecast_table AS Forecast
ON Consumed.item = Forecast.item AND 
   Consumed.snapshot = Forecast.snapshot AND
   Consumed.date = Forecast.forecast;

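The key to this query is that the CROSS JOIN generates every required output row, already populated with the consumed quantity. The LEFT JOIN then keeps all of those rows and picks the forecast quantity wherever one is available.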
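
For anyone who wants to try the CROSS JOIN plus LEFT JOIN pattern without a BigQuery project, here is a minimal sketch that runs the same idea in SQLite through Python's standard `sqlite3` module. SQLite is only a local stand-in, so the syntax differs slightly from BigQuery legacy SQL. Two things are my own variations rather than part of the query above: the snapshot list comes from `SELECT DISTINCT snapshot` instead of being hard-coded, and `COALESCE` replaces the `IF (... IS NULL, ...)` expression.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE consumption_table (item INTEGER, date TEXT, quantity INTEGER);
INSERT INTO consumption_table VALUES
  (123, '8/25/14', 2222), (123, '9/1/14', 333),
  (123, '9/8/14', 444),   (123, '9/15/14', 0);

CREATE TABLE forecast_table (item INTEGER, snapshot TEXT, forecast TEXT, quantity INTEGER);
INSERT INTO forecast_table VALUES
  (123, '8/24/14', '8/25/14', 7661), (123, '8/24/14', '9/1/14', 4980),
  (123, '8/24/14', '9/8/14', 588),   (123, '8/24/14', '9/15/14', 2232),
  (123, '8/31/14', '9/1/14', 8319),  (123, '8/31/14', '9/8/14', 1968),
  (123, '8/31/14', '9/15/14', 2760),
  (123, '9/7/14', '9/8/14', 6931),   (123, '9/7/14', '9/15/14', 684),
  (123, '9/14/14', '9/15/14', 9328);
""")

# CROSS JOIN builds every (snapshot, week) cell with the consumed quantity;
# LEFT JOIN overrides it with the forecast quantity when a forecast row exists.
query = """
SELECT C.item, S.snapshot, C.date,
       COALESCE(F.quantity, C.quantity) AS quantity
FROM (SELECT DISTINCT snapshot FROM forecast_table) AS S
CROSS JOIN consumption_table AS C
LEFT JOIN forecast_table AS F
  ON F.item = C.item AND F.snapshot = S.snapshot AND F.forecast = C.date
"""
rows = conn.execute(query).fetchall()

# Pivot the long result into the waterfall layout from the question.
waterfall = {(snapshot, date): qty for _, snapshot, date, qty in rows}
snapshots = ["8/24/14", "8/31/14", "9/7/14", "9/14/14"]
weeks = ["8/25/14", "9/1/14", "9/8/14", "9/15/14"]
for snap in snapshots:
    print(snap, [waterfall[(snap, week)] for week in weeks])
```

Deriving the snapshots from the forecast table avoids editing the query every time a new snapshot arrives, at the cost of an extra scan of that table.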