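I'm not sure whether this is in the right category, but I am trying to calculate amounts in SQL. This is what I have:
A table of transactions, and basically nothing else. Each transaction represents a product being "put on" or "taken from" a shelf. There is no data on how many products are on the shelves, and I now want to calculate the number of products on every shelf for each hour of the day, for every day.
Table: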
Transaction Datetime, Source, Destination, Product ID, Product Group
2019-02-01 08:01:00, Person1, Shelf1, 1234, 1
2019-02-01 10:01:00, Shelf1, Person1, 1234, 1
2019-02-01 08:03:00, Person2, Shelf1, 5678, 1
...
Desired table:
Hour, Date, Shelf, Product Group, Amount
8, 2019-02-01, Shelf1, 1, 5
9, 2019-02-01, Shelf1, 1, 10
10, 2019-02-01, Shelf1, 1, 10
Any ideas on how to do this? Any suggestions would be greatly appreciated.
BR, Chris
Answer 0 (score: 1)
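I would use datetime_trunc() and put the hour and the date in the same column.
The basic idea is to "flip" the rows so that the shelf always appears in the same column, and to add a signed amount indicator: +1 when a product is put on the shelf, -1 when it is taken off.
You can then use a cumulative sum to get the net amount at the end of each hour, or just use a plain aggregation to get the change within each hour.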
select datetime_trunc(transaction_datetime, hour) as yyyymmddhh,
       shelf, product_group,
       sum(inc) as changes_this_hour,
       -- running total of the hourly changes, per shelf and product group
       sum(sum(inc)) over (partition by shelf, product_group
                           order by min(transaction_datetime)) as net_amount
from ((select transaction_datetime,
              source as shelf, destination as other_party, product_id, product_group,
              -1 as inc  -- product taken off the shelf
       from t
       where source like 'Shelf%'
      )
      union all
      (select transaction_datetime,
              destination as shelf, source as other_party, product_id, product_group,
              1 as inc   -- product put on the shelf
       from t
       where destination like 'Shelf%'
      )
     ) t
group by yyyymmddhh, shelf, product_group
order by shelf, product_group, yyyymmddhh;
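To make the "flip and sign" idea concrete outside of SQL, here is a minimal Python sketch run on the three sample rows from the question. The helper names (`is_shelf`, `hourly_changes`, `running_totals`) are made up for illustration, and recognising a shelf by its name prefix is an assumption, mirroring the `like 'Shelf%'` filter.

```python
from collections import defaultdict
from datetime import datetime

# Sample rows from the question:
# (transaction datetime, source, destination, product id, product group)
rows = [
    ("2019-02-01 08:01:00", "Person1", "Shelf1", 1234, 1),
    ("2019-02-01 10:01:00", "Shelf1", "Person1", 1234, 1),
    ("2019-02-01 08:03:00", "Person2", "Shelf1", 5678, 1),
]

def is_shelf(name):
    # Assumption: shelves are recognised by their name prefix.
    return name.startswith("Shelf")

def hourly_changes(rows):
    """Net change per (shelf, product group, hour): +1 in, -1 out."""
    changes = defaultdict(int)
    for ts, source, destination, _product_id, group in rows:
        # Truncate the timestamp to the hour, like datetime_trunc(..., hour).
        hour = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S").replace(minute=0, second=0)
        if is_shelf(source):
            changes[(source, group, hour)] -= 1       # taken off the shelf
        if is_shelf(destination):
            changes[(destination, group, hour)] += 1  # put on the shelf
    return dict(changes)

def running_totals(changes):
    """Cumulative sum of the hourly changes within each (shelf, group)."""
    totals = defaultdict(int)
    result = {}
    for shelf, group, hour in sorted(changes):
        totals[(shelf, group)] += changes[(shelf, group, hour)]
        result[(shelf, group, hour)] = totals[(shelf, group)]
    return result

for key, amount in running_totals(hourly_changes(rows)).items():
    print(key, amount)
```

Note that this only produces rows for hours in which something happened (08:00 and 10:00 in the sample); hours without transactions have to be filled in separately if they are needed in the output.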
Answer 1 (score: 1)
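Below is for BigQuery Standard SQL.
The idea is to first collect all hours, locations and product groups (the hours, locations and product_groups CTEs in the query).
Then CROSS JOIN those three and LEFT JOIN the main data to them, so that every hour, day, location and product_group is present, while calculating the amount based on whether the location is the source or the destination.
Next, all amounts are grouped within the same hour/day and, obviously, location and Product_Group.
Finally, an analytic function is applied to calculate the cumulative amount.
The final code, with sample data, is below:
#standardSQL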
WITH `project.dataset.table` AS (
  SELECT DATETIME '2019-02-01 08:01:00' Transaction_Datetime, 'Person1' Source, 'Shelf1' Destination, 1234 Product_ID, 1 Product_Group UNION ALL
  SELECT '2019-02-01 10:01:00', 'Shelf1', 'Person1', 1234, 1 UNION ALL
  SELECT '2019-02-01 08:03:00', 'Person2', 'Shelf1', 5678, 1
), hours AS (
  -- every hour between the first and the last transaction
  SELECT EXTRACT(HOUR FROM hour) hour, DATE(hour) day
  FROM (
    SELECT
      MIN(TIMESTAMP(Transaction_Datetime)) min_ts,
      MAX(TIMESTAMP(Transaction_Datetime)) max_ts
    FROM `project.dataset.table`
  ), UNNEST(GENERATE_TIMESTAMP_ARRAY(
       TIMESTAMP_TRUNC(min_ts, HOUR),
       TIMESTAMP_TRUNC(max_ts, HOUR),
       INTERVAL 1 HOUR)) hour
), locations AS (
  SELECT Source AS location FROM `project.dataset.table`
  UNION DISTINCT
  SELECT Destination FROM `project.dataset.table`
), product_groups AS (
  SELECT DISTINCT Product_Group FROM `project.dataset.table`
), temp AS (
  SELECT
    EXTRACT(HOUR FROM Transaction_Datetime) hour,
    DATE(Transaction_Datetime) day,
    Source, Destination, Product_ID, Product_Group
  FROM `project.dataset.table`
)
SELECT hour, day, location, product_group,
  -- order by day first, then hour, so the running total stays chronological across days
  SUM(delta) OVER(PARTITION BY location, product_group ORDER BY day, hour) amount
FROM (
  SELECT
    hours.hour, hours.day, location, product_groups.product_group,
    SUM(CASE location
      WHEN Source THEN -1
      WHEN Destination THEN 1
      ELSE 0
    END) delta
  FROM locations, hours, product_groups
  LEFT JOIN temp t
    ON t.hour = hours.hour
    AND t.day = hours.day
    AND t.product_group = product_groups.product_group
    AND location IN (Source, Destination)
  GROUP BY hours.hour, hours.day, location, Product_Group
)
WHERE LOWER(location) LIKE 'shelf%'
-- ORDER BY day, hour, location
with result:
Row hour day location product_group amount
1 8 2019-02-01 Shelf1 1 2
2 9 2019-02-01 Shelf1 1 2
3 10 2019-02-01 Shelf1 1 1
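The row for hour 9 appears even though nothing happened in that hour, because the hour grid is generated first and the transactions are joined onto it. A minimal Python sketch of that step follows; the function name `fill_hours` is hypothetical, and the `deltas` values are the hourly net changes for Shelf1 / product group 1 derived from the sample data (two deposits at 08:xx, one withdrawal at 10:xx).

```python
from datetime import datetime, timedelta

# Hourly net changes for Shelf1 / product group 1, from the sample data.
deltas = {
    datetime(2019, 2, 1, 8): 2,
    datetime(2019, 2, 1, 10): -1,
}

def fill_hours(deltas):
    """Return [(hour, amount)] for every hour between the first and last change."""
    if not deltas:
        return []
    hour, last = min(deltas), max(deltas)
    amount = 0
    filled = []
    while hour <= last:
        # Hours without transactions contribute 0, like the unmatched
        # side of the LEFT JOIN, so the previous amount carries over.
        amount += deltas.get(hour, 0)
        filled.append((hour, amount))
        hour += timedelta(hours=1)
    return filled

for hour, amount in fill_hours(deltas):
    print(hour.date(), hour.hour, amount)
```

For the sample data this yields amounts 2, 2 and 1 for hours 8, 9 and 10, matching the result table above.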
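Note: it is not clear from your question how to distinguish a Shelf from a Person, so LOWER(location) LIKE 'shelf%' is used. You can adjust this to whatever logic applies in your case.
If you remove that line, you will get not only the amounts on the shelves but also the balance of products "on hand" for each person.
To test against your own table, run the query below, and don't forget to replace `project.dataset.table` with your full table reference: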
#standardSQL
WITH hours AS (
  -- every hour between the first and the last transaction
  SELECT EXTRACT(HOUR FROM hour) hour, DATE(hour) day
  FROM (
    SELECT
      MIN(TIMESTAMP(Transaction_Datetime)) min_ts,
      MAX(TIMESTAMP(Transaction_Datetime)) max_ts
    FROM `project.dataset.table`
  ), UNNEST(GENERATE_TIMESTAMP_ARRAY(
       TIMESTAMP_TRUNC(min_ts, HOUR),
       TIMESTAMP_TRUNC(max_ts, HOUR),
       INTERVAL 1 HOUR)) hour
), locations AS (
  SELECT Source AS location FROM `project.dataset.table`
  UNION DISTINCT
  SELECT Destination FROM `project.dataset.table`
), product_groups AS (
  SELECT DISTINCT Product_Group FROM `project.dataset.table`
), temp AS (
  SELECT
    EXTRACT(HOUR FROM Transaction_Datetime) hour,
    DATE(Transaction_Datetime) day,
    Source, Destination, Product_ID, Product_Group
  FROM `project.dataset.table`
)
SELECT hour, day, location, product_group,
  -- order by day first, then hour, so the running total stays chronological across days
  SUM(delta) OVER(PARTITION BY location, product_group ORDER BY day, hour) amount
FROM (
  SELECT
    hours.hour, hours.day, location, product_groups.product_group,
    SUM(CASE location
      WHEN Source THEN -1
      WHEN Destination THEN 1
      ELSE 0
    END) delta
  FROM locations, hours, product_groups
  LEFT JOIN temp t
    ON t.hour = hours.hour
    AND t.day = hours.day
    AND t.product_group = product_groups.product_group
    AND location IN (Source, Destination)
  GROUP BY hours.hour, hours.day, location, Product_Group
)
WHERE LOWER(location) LIKE 'shelf%'
-- ORDER BY day, hour, location