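Calculating amounts based on transactions/events

Time: 2019-02-03 14:44:08

Tags: sql google-bigquery

I don't know if this is in the right category, but I am trying to calculate amounts in SQL. This is what I have:

A table with transactions, and that is basically it. Each transaction represents a product being "put on" or "taken off" a shelf. There is no data on how many products are on the shelves, and I now want to calculate the number of products on every shelf over the course of the day, for every day.

Table: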

Transaction Datetime, Source, Destination, Product ID, Product Group
2019-02-01 08:01:00, Person1, Shelf1, 1234, 1
2019-02-01 10:01:00, Shelf1, Person1, 1234, 1
2019-02-01 08:03:00, Person2, Shelf1, 5678, 1
...

Desired table:

Hour, Date, Shelf, Product Group, Amount
8, 2019-02-01, Shelf1, 1, 5
9, 2019-02-01, Shelf1, 1, 10
10, 2019-02-01, Shelf1, 1, 10

Any ideas on how to do this? Any suggestions would be much appreciated.

BR, Chris

2 answers:

Answer 0: (score: 1)

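I would use datetime_trunc() and keep the hour in the same column as the date.

But the basic idea is to "flip" the rows so that the shelf always ends up in the same column, and to add a signed indicator of the amount: +1 when a product is put on the shelf, -1 when it is taken off.

You can then use a cumulative sum to get the net amount at the end of each hour, or just use a plain aggregation to get the changes within the hour.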

select datetime_trunc(transaction_datetime, hour) as yyyymmddhh,
       shelf, product_group,
       sum(inc) as changes_this_hour,
       -- running total of the hourly changes per shelf and product group
       sum(sum(inc)) over (partition by shelf, product_group
                           order by min(transaction_datetime)) as net_amount
from ((select transaction_datetime,
              source as shelf, product_id, product_group,
              -1 as inc  -- product taken off the shelf
       from t
       where source like 'Shelf%'
      )
      union all
      (select transaction_datetime,
              destination as shelf, product_id, product_group,
              1 as inc   -- product put on the shelf
       from t
       where destination like 'Shelf%'
      )
     ) t
group by yyyymmddhh, shelf, product_group
order by shelf, product_group, yyyymmddhh;
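
For anyone who wants to check the logic outside BigQuery, here is a minimal Python sketch of the same idea: turn each transaction into a signed change for the shelf involved, sum the changes per hour, then take a running total. The function names and the `is_shelf` helper are hypothetical, and recognising shelves by the `Shelf` name prefix is an assumption taken from the sample data, not something the question states.

```python
from collections import defaultdict
from datetime import datetime

# Sample rows from the question:
# (transaction_datetime, source, destination, product_id, product_group)
transactions = [
    ("2019-02-01 08:01:00", "Person1", "Shelf1", 1234, 1),
    ("2019-02-01 10:01:00", "Shelf1", "Person1", 1234, 1),
    ("2019-02-01 08:03:00", "Person2", "Shelf1", 5678, 1),
]


def is_shelf(location):
    # Assumption: shelves are recognisable by their name prefix.
    return location.startswith("Shelf")


def hourly_changes(rows):
    """Sum the signed changes per (shelf, product_group, hour)."""
    changes = defaultdict(int)
    for ts, source, destination, _product_id, group in rows:
        # Truncate the timestamp to the hour, like datetime_trunc(..., hour).
        hour = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S").replace(minute=0, second=0)
        if is_shelf(source):
            changes[(source, group, hour)] -= 1       # taken off the shelf
        if is_shelf(destination):
            changes[(destination, group, hour)] += 1  # put on the shelf
    return dict(changes)


def net_amounts(changes):
    """Running total of the hourly changes per (shelf, product_group)."""
    running = defaultdict(int)
    result = {}
    # Sorting the keys puts the hours in chronological order within each shelf/group.
    for shelf, group, hour in sorted(changes):
        running[(shelf, group)] += changes[(shelf, group, hour)]
        result[(shelf, group, hour)] = running[(shelf, group)]
    return result
```

Note that, like the query above, this only produces rows for hours in which something happened: with the sample data there is an entry for 08:00 and 10:00 but none for 09:00.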

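Answer 1: (score: 1)

Below is for BigQuery Standard SQL.

The idea is to first collect:

  • all product locations (sources and destinations)
  • all product groups
  • all hours/days the report should cover

Then CROSS JOIN these three and LEFT JOIN the main data to them, which gives the hour, date, location and product_group for every row, while calculating the amount based on whether the location is the source or the destination of a transaction.

Next, all amounts are grouped within the same hour/day and, obviously, the same location and Product_Group.
Finally, an analytic function is applied to calculate the cumulative amount.

The final code, with sample data, is below:
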
#standardSQL
WITH `project.dataset.table` AS (
  SELECT DATETIME '2019-02-01 08:01:00' Transaction_Datetime, 'Person1' Source, 'Shelf1' Destination, 1234 Product_ID, 1 Product_Group UNION ALL
  SELECT '2019-02-01 10:01:00', 'Shelf1', 'Person1', 1234, 1 UNION ALL
  SELECT '2019-02-01 08:03:00', 'Person2', 'Shelf1', 5678, 1 
), hours AS (
  SELECT EXTRACT(HOUR FROM hour) hour, DATE(hour) day
  FROM (
    SELECT 
      MIN(TIMESTAMP(Transaction_Datetime)) min_ts,
      MAX(TIMESTAMP(Transaction_Datetime)) max_ts
    FROM `project.dataset.table`
  ), UNNEST(GENERATE_TIMESTAMP_ARRAY(
    TIMESTAMP_TRUNC(min_ts, HOUR),
    TIMESTAMP_TRUNC(max_ts, HOUR),
    INTERVAL 1 HOUR)) hour
), locations AS (
  SELECT Source AS location FROM `project.dataset.table`
  UNION DISTINCT 
  SELECT Destination FROM `project.dataset.table`
), product_groups AS (
  SELECT DISTINCT Product_Group FROM `project.dataset.table`
), temp AS (
  SELECT 
    EXTRACT(HOUR FROM Transaction_Datetime) hour,
    DATE(Transaction_Datetime) day,
    Source, Destination, Product_ID, Product_Group
  FROM `project.dataset.table`
)
SELECT hour, day, location, product_group,
  SUM(delta) OVER(PARTITION BY location, product_group ORDER BY day, hour) amount
FROM (
  SELECT 
    hours.hour, hours.day, location, product_groups.product_group,
    SUM(CASE location 
      WHEN Source THEN -1
      WHEN Destination THEN 1
      ELSE 0
    END) delta 
  FROM locations, hours, product_groups
  LEFT JOIN temp t
  ON t.hour = hours.hour
  AND t.day = hours.day
  AND t.product_group = product_groups.product_group
  AND location IN (Source, Destination)
  GROUP BY hours.hour, hours.day, location, Product_Group
)
WHERE LOWER(location) LIKE 'shelf%' 
-- ORDER BY day, hour, location

with the result:

Row  hour  day         location  product_group  amount
1    8     2019-02-01  Shelf1    1              2
2    9     2019-02-01  Shelf1    1              2
3    10    2019-02-01  Shelf1    1              1

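Note: it is not clear from your question how to distinguish a shelf from a person, so LOWER(location) LIKE 'shelf%' is used. You can adjust this to whatever logic you have for that.
If you remove that line, you will get not only the amounts on the shelves but also the balance of products "on hand" for each person.

To test against your own table, run the query below. Don't forget to replace `project.dataset.table` with your full table reference.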
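
The gap-filling part of this approach may be easier to follow outside SQL. The following Python sketch mirrors it under stated assumptions: it builds every hour between the first and last transaction, pairs each location with each product group, looks up the signed change for each combination (0 when nothing happened), and accumulates in chronological order. The names `hourly_report` and `keep` are hypothetical; `keep` plays the role of the LOWER(location) LIKE 'shelf%' filter.

```python
from datetime import datetime, timedelta
from itertools import product

# Sample rows from the question:
# (transaction_datetime, source, destination, product_id, product_group)
transactions = [
    ("2019-02-01 08:01:00", "Person1", "Shelf1", 1234, 1),
    ("2019-02-01 10:01:00", "Shelf1", "Person1", 1234, 1),
    ("2019-02-01 08:03:00", "Person2", "Shelf1", 5678, 1),
]


def parse_hour(ts):
    # Truncate a timestamp string to the hour.
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S").replace(minute=0, second=0)


def hourly_report(rows, keep=lambda loc: loc.lower().startswith("shelf")):
    seen = [parse_hour(r[0]) for r in rows]
    first, last = min(seen), max(seen)
    # Every hour from the first to the last transaction, inclusive.
    n_hours = (last - first) // timedelta(hours=1) + 1
    hours = [first + timedelta(hours=i) for i in range(n_hours)]
    locations = sorted({r[1] for r in rows} | {r[2] for r in rows})
    groups = sorted({r[4] for r in rows})

    # Signed change per (location, group, hour): -1 for the source, +1 for the destination.
    delta = {}
    for ts, source, destination, _product_id, group in rows:
        hour = parse_hour(ts)
        delta[(source, group, hour)] = delta.get((source, group, hour), 0) - 1
        delta[(destination, group, hour)] = delta.get((destination, group, hour), 0) + 1

    report = []
    for location, group in product(locations, groups):
        if not keep(location):
            continue
        amount = 0
        for hour in hours:  # chronological, so the sum is cumulative
            amount += delta.get((location, group, hour), 0)
            report.append((hour.hour, hour.date().isoformat(), location, group, amount))
    return report
```

With the sample data, `hourly_report(transactions)` yields amounts 2, 2 and 1 for hours 8, 9 and 10 on Shelf1, matching the result table above. Passing `keep=lambda loc: True` also returns the per-person balances, which go negative here because the sample data never shows the persons receiving the products they later put on the shelf.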

#standardSQL
WITH hours AS (
  SELECT EXTRACT(HOUR FROM hour) hour, DATE(hour) day
  FROM (
    SELECT 
      MIN(TIMESTAMP(Transaction_Datetime)) min_ts,
      MAX(TIMESTAMP(Transaction_Datetime)) max_ts
    FROM `project.dataset.table`
  ), UNNEST(GENERATE_TIMESTAMP_ARRAY(
    TIMESTAMP_TRUNC(min_ts, HOUR),
    TIMESTAMP_TRUNC(max_ts, HOUR),
    INTERVAL 1 HOUR)) hour
), locations AS (
  SELECT Source AS location FROM `project.dataset.table`
  UNION DISTINCT 
  SELECT Destination FROM `project.dataset.table`
), product_groups AS (
  SELECT DISTINCT Product_Group FROM `project.dataset.table`
), temp AS (
  SELECT 
    EXTRACT(HOUR FROM Transaction_Datetime) hour,
    DATE(Transaction_Datetime) day,
    Source, Destination, Product_ID, Product_Group
  FROM `project.dataset.table`
)
SELECT hour, day, location, product_group,
  SUM(delta) OVER(PARTITION BY location, product_group ORDER BY day, hour) amount
FROM (
  SELECT 
    hours.hour, hours.day, location, product_groups.product_group,
    SUM(CASE location 
      WHEN Source THEN -1
      WHEN Destination THEN 1
      ELSE 0
    END) delta 
  FROM locations, hours, product_groups
  LEFT JOIN temp t
  ON t.hour = hours.hour
  AND t.day = hours.day
  AND t.product_group = product_groups.product_group
  AND location IN (Source, Destination)
  GROUP BY hours.hour, hours.day, location, Product_Group
)
WHERE LOWER(location) LIKE 'shelf%' 
-- ORDER BY day, hour, location