我有这样的月度销售数据
Company Month Sales
Adidas 2018-09 100
Adidas 2018-08 95
Adidas 2018-07 120
Adidas 2018-06 155
...and so on
我需要添加另一列说明median over the past 12 months
(如果没有可用的12个月,则列出尽可能多的数据)。
在Python中,我想出了如何使用for
循环来执行此操作,但不确定在BigQuery中如何执行。
谢谢!
答案 0 :(得分:1)
这是一种可行的方法:
CREATE TEMP FUNCTION MEDIAN(arr ANY TYPE) AS ((
SELECT
IF(
MOD(ARRAY_LENGTH(arr), 2) = 0,
(arr[OFFSET(DIV(ARRAY_LENGTH(arr), 2) - 1)] + arr[OFFSET(DIV(ARRAY_LENGTH(arr), 2))]) / 2,
arr[OFFSET(DIV(ARRAY_LENGTH(arr), 2))]
)
FROM (SELECT ARRAY_AGG(x ORDER BY x) AS arr FROM UNNEST(arr) AS x)
));
SELECT
Company,
Month,
MEDIAN(
ARRAY_AGG(Sales) OVER (PARTITION BY Company ORDER BY Month ROWS BETWEEN 11 PRECEDING AND CURRENT ROW)
) AS trailing_median
FROM (
SELECT 'Adidas' AS Company, '2018-09' AS Month, 100 AS Sales UNION ALL
SELECT 'Adidas', '2018-08', 95 UNION ALL
SELECT 'Adidas', '2018-07', 120 UNION ALL
SELECT 'Adidas', '2018-06', 155
);
结果是:
+---------+---------+-----------------+
| Company | Month | trailing_median |
+---------+---------+-----------------+
| Adidas | 2018-06 | 155.0 |
| Adidas | 2018-07 | 137.5 |
| Adidas | 2018-08 | 120.0 |
| Adidas | 2018-09 | 110.0 |
+---------+---------+-----------------+