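I am calculating a moving average over the last 100 sales of a particular item. I want to know whether user X has spent more than 5 times what everyone else has spent on that item within the window of the last 100 sales.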
--how much has the current row user spent on this item over the last 100 sales?
SUM(saleprice) OVER(PARTITION BY item, user ORDER BY saledate ROWS BETWEEN 100 PRECEDING AND CURRENT ROW)
--pseudocode: how much has everyone else, excluding this user, spent on that item over the last 100 sales?
SUM(saleprice) OVER(PARTITION BY item ORDER BY saledate ROWS BETWEEN 100 PRECEDING AND CURRENT ROW WHERE preceding_row.user <> current_row.user)
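Ultimately, I don't want the big spender's purchases to be counted in the total for the smaller spenders. Is it possible to exclude a row from the window if it fails some comparison against the current row? (In my case: don't sum the saleprice of a preceding row if it has the same user as the current row.)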
Answer 0 (score: 3)
The first one looks fine to me, except that you are counting 101 sales (100 PRECEDING plus the CURRENT ROW).
--how much has the current row user spent on this item over the last 100 sales?
SUM(saleprice)
OVER (
PARTITION BY item, user
ORDER BY saledate
ROWS BETWEEN 100 PRECEDING AND 1 PRECEDING -- 100 excluding this sale
ROWS BETWEEN 99 PRECEDING AND CURRENT ROW -- 100 including this sale
)
(Use only one of the two suggested ROWS BETWEEN clauses.)
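To see the off-by-one for yourself, here is a minimal sketch that counts the rows in each frame instead of summing them. It assumes SQL Server 2012 or later and uses `sys.all_objects` only as a convenient source of at least 150 rows; the CTE `n` and the column `i` are made up for the demo and stand in for sales ordered by `saledate`.

```sql
-- Demo only: 150 numbered rows standing in for 150 sales of one item.
WITH n AS (
    SELECT TOP (150)
        ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS i
    FROM sys.all_objects
    ORDER BY i
)
SELECT
    i,
    -- The frame from the question: 100 earlier rows plus the current row.
    COUNT(*) OVER (ORDER BY i ROWS BETWEEN 100 PRECEDING AND CURRENT ROW) AS cnt_100_preceding_to_current,
    -- 100 rows including the current row.
    COUNT(*) OVER (ORDER BY i ROWS BETWEEN 99 PRECEDING AND CURRENT ROW)  AS cnt_99_preceding_to_current,
    -- 100 rows excluding the current row.
    COUNT(*) OVER (ORDER BY i ROWS BETWEEN 100 PRECEDING AND 1 PRECEDING) AS cnt_100_preceding_to_1_preceding
FROM n
ORDER BY i;
```

Once at least 100 earlier rows exist (for example at `i = 150`), the three counts are 101, 100 and 100. Earlier rows show smaller counts because the frame is cut off at the start of the partition.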
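In the second expression, you can't add a WHERE clause. You can change the aggregation, the partitioning, and the ordering, but I can't see how that would help you. I think you need a correlated subquery and/or OUTER APPLY...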
SELECT
*,
SUM(saleprice)
OVER (
PARTITION BY item, user
ORDER BY saledate
ROWS BETWEEN 99 PRECEDING AND CURRENT ROW -- 100 including this sale
)
AS user_total_100_purchases_to_date,
others_sales_top_100_total.saleprice
FROM
sales_data
OUTER APPLY
(
SELECT
SUM(saleprice) AS saleprice
FROM
(
SELECT TOP(100) saleprice
FROM sales_data others_sales
WHERE others_sales.user <> sales_data.user
AND others_sales.item = sales_data.item
AND others_sales.saledate <= sales_data.saledate
ORDER BY others_sales.saledate DESC
)
AS others_sales_top_100
)
AS others_sales_top_100_total
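For comparison, here is a rough sketch of the same "everyone else" total written as a correlated scalar subquery rather than OUTER APPLY. It assumes the same `sales_data` table and columns as above. The output alias `others_total_100_sales_to_date` is my own name. I have bracketed `[user]` because USER is a reserved keyword in SQL Server, so a column with that name generally needs to be quoted.

```sql
SELECT
    sales_data.*,
    (
        -- Sum of the 100 most recent sales of this item, up to this sale's
        -- date, made by anyone other than the current row's user.
        SELECT SUM(others_sales_top_100.saleprice)
        FROM (
            SELECT TOP (100) others_sales.saleprice
            FROM sales_data AS others_sales
            WHERE others_sales.[user] <> sales_data.[user]
              AND others_sales.item = sales_data.item
              AND others_sales.saledate <= sales_data.saledate
            ORDER BY others_sales.saledate DESC
        ) AS others_sales_top_100
    ) AS others_total_100_sales_to_date
FROM sales_data;
```

The result is NULL when nobody else has bought the item yet, which is the same behaviour OUTER APPLY gives. OUTER APPLY becomes the more convenient form once you want more than one aggregate back from the same inner query.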
EDIT: Another way of looking at it, to keep things consistent
SELECT
*,
usr_last100_saletotal,
all_last100_saletotal,
CASE WHEN usr_last100_saletotal > all_last100_saletotal * 0.8
THEN 'user spent 80%, or more, of last 100 sales'
ELSE 'user spent under 80% of last 100 sales'
END
AS spend_category -- the column alias was missing in the original; this name is a placeholder
FROM
sales_data
OUTER APPLY
(
SELECT
SUM(CASE top100.user WHEN sales_data.user THEN top100.saleprice END) AS usr_last100_saletotal,
SUM( top100.saleprice ) AS all_last100_saletotal
FROM
(
SELECT TOP(100) user, saleprice
FROM sales_data AS others_sales
WHERE others_sales.item = sales_data.item
AND others_sales.saledate <= sales_data.saledate
ORDER BY others_sales.saledate DESC
)
AS top100
)
AS top100_summary
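If you want the "more than 5 times everyone else" test from the question rather than a share of the total, a sketch along these lines may work. It reuses the same last-100 summary: everyone else's spend is the window total minus the user's own spend. The aliases `others_last100_saletotal` and `spent_over_5x_others` are names I made up, and `[user]` is bracketed because USER is a reserved keyword in SQL Server.

```sql
SELECT
    sales_data.*,
    top100_summary.usr_last100_saletotal,
    -- Everyone else's spend = total of the window minus this user's spend.
    top100_summary.all_last100_saletotal
        - top100_summary.usr_last100_saletotal AS others_last100_saletotal,
    CASE WHEN top100_summary.usr_last100_saletotal
              > 5 * (top100_summary.all_last100_saletotal
                     - top100_summary.usr_last100_saletotal)
         THEN 1 ELSE 0
    END AS spent_over_5x_others
FROM sales_data
OUTER APPLY
(
    SELECT
        SUM(CASE top100.[user] WHEN sales_data.[user] THEN top100.saleprice END) AS usr_last100_saletotal,
        SUM(top100.saleprice) AS all_last100_saletotal
    FROM
    (
        -- The 100 most recent sales of this item up to this sale's date.
        SELECT TOP (100) others_sales.[user], others_sales.saleprice
        FROM sales_data AS others_sales
        WHERE others_sales.item = sales_data.item
          AND others_sales.saledate <= sales_data.saledate
        ORDER BY others_sales.saledate DESC
    ) AS top100
) AS top100_summary;
```

Note that spending more than 5 times everyone else is the same as spending more than 5/6 (about 83.3%) of the window total, so this test is slightly stricter than the 80% threshold, which corresponds to 4 times everyone else.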