I have a simple table with a person, a date, and a quantity:
Person  Date      Qty
Jim     08/01/16  1
Jim     08/02/16  3
Jim     08/03/16  2
Jim     08/04/16  1
Jim     08/05/16  1
Jim     08/06/16  6
Sheila  08/01/16  1
Sheila  08/02/16  1
Sheila  08/03/16  1
Sheila  08/04/16  1
Sheila  08/05/16  1
Sheila  08/06/16  1
I want to compute two columns, a cumulative total and a percent of total, giving the table below:
Person  Date      Qty  cum tot  pct of tot
Jim     08/01/16  1    1        7%
Jim     08/02/16  3    4        29%
Jim     08/03/16  2    6        43%
Jim     08/04/16  1    7        50%
Jim     08/05/16  1    8        57%
Jim     08/06/16  6    14       100%
Sheila  08/01/16  1    1        17%
Sheila  08/02/16  1    2        33%
Sheila  08/03/16  1    3        50%
Sheila  08/04/16  1    4        67%
Sheila  08/05/16  1    5        83%
Sheila  08/06/16  1    6        100%
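From this data set, I want to find, for each person, the date on which their cumulative total reaches 50% (or any other percentage I supply).
So for a 50% threshold the output would be: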
Jim 08/04/16
Sheila 08/03/16
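Any suggestions on how to compute the two columns and determine the appropriate date?
Answer 0: (score: 2)
You can compute the cumulative sum with the ANSI-standard cumulative sum functionality (sum() as a window function with an order by). The rest is just arithmetic: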
select t.*
from (select t.*,
             -- running total per person, in date order
             sum(qty) over (partition by person order by date) as running_qty,
             -- grand total per person
             sum(qty) over (partition by person) as tot_qty,
             (sum(qty) over (partition by person order by date) * 1.0 /
              sum(qty) over (partition by person)
             ) as running_percent
      from sales t
     ) t
where running_percent >= 0.5 and
      running_percent - (qty * 1.0 / tot_qty) < 0.5;
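The where clause has two conditions so that only one row per person is returned. The first condition alone would return every row at or above 0.5, but you only want the first of them, the row where the percentage crosses the threshold. The second condition enforces that: it subtracts the current row's share of the total, so it holds only when the running percentage before this row was still below 0.5.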
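To try the two-condition filter against the sample data with an arbitrary threshold, here is a minimal sketch that runs an equivalent query from Python's sqlite3 module. Several things here are assumptions of the sketch rather than part of the answer: the function name threshold_dates, the ROWS list, storing dates as ISO strings so that they sort correctly as text, and an SQLite build with window-function support (3.25 or later). It also multiplies both sides by the total and compares integers (running_qty * 100 >= pct * tot_qty) instead of dividing, which is algebraically the same test for a positive total but avoids floating-point rounding right at the threshold.

```python
import sqlite3

# Sample data from the question; dates rewritten as ISO strings (an
# assumption of this sketch) so that ORDER BY on text sorts chronologically.
ROWS = [
    ("Jim", "2016-08-01", 1), ("Jim", "2016-08-02", 3),
    ("Jim", "2016-08-03", 2), ("Jim", "2016-08-04", 1),
    ("Jim", "2016-08-05", 1), ("Jim", "2016-08-06", 6),
    ("Sheila", "2016-08-01", 1), ("Sheila", "2016-08-02", 1),
    ("Sheila", "2016-08-03", 1), ("Sheila", "2016-08-04", 1),
    ("Sheila", "2016-08-05", 1), ("Sheila", "2016-08-06", 1),
]

# Same shape as the answer's query, but the threshold is a bound parameter
# (:pct, a whole-number percentage) and the comparison is done on integers.
QUERY = """
SELECT person, date
FROM (SELECT person, date, qty,
             SUM(qty) OVER (PARTITION BY person ORDER BY date) AS running_qty,
             SUM(qty) OVER (PARTITION BY person) AS tot_qty
      FROM sales) AS t
WHERE running_qty * 100 >= :pct * tot_qty          -- at or past the threshold
  AND (running_qty - qty) * 100 < :pct * tot_qty   -- previous row was below it
ORDER BY person
"""


def threshold_dates(pct):
    """Return (person, date) for the row where each person crosses pct percent."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE sales (person TEXT, date TEXT, qty INTEGER)")
        conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", ROWS)
        return conn.execute(QUERY, {"pct": pct}).fetchall()
    finally:
        conn.close()


print(threshold_dates(50))
```

For the 50% threshold this should pick 2016-08-04 for Jim and 2016-08-03 for Sheila, matching the output requested in the question.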
The * 1.0 is there because some databases do integer division.
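SQLite happens to be one database that divides integers this way, so the effect is easy to see in a small sketch. The function name division_demo is hypothetical, and the values 7 and 14 are simply Jim's running total and grand total on 08/04/16 from the question.

```python
import sqlite3


def division_demo():
    """Return (integer division result, result after multiplying by 1.0)."""
    conn = sqlite3.connect(":memory:")
    try:
        # 7 / 14 uses two integers, so SQLite truncates the result;
        # 7 * 1.0 / 14 promotes the numerator to a float first.
        return conn.execute("SELECT 7 / 14, 7 * 1.0 / 14").fetchone()
    finally:
        conn.close()


print(division_demo())
```

Without the * 1.0, every running percentage below 100% would truncate to 0 in such a database, and the threshold filter would match nothing until the final row.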