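I have an event queue of sorts, where each event has an ID, a key, and a value.
I want to group this table by key and, for each row, collect all of that key's values up to that row
(similar to cumsum, but as an array aggregate).
I know how to do this in SQL: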
WITH t AS (
    SELECT *
    FROM (
        VALUES
            (1, 'A', 1),
            (2, 'B', 1),
            (3, 'A', 2),
            (4, 'A', 3),
            (5, 'A', 5),
            (6, 'B', 8)
    ) AS v(id, key, val)
)
SELECT
    *,
    array_agg(val) OVER (
        PARTITION BY key
        ORDER BY id
    )
FROM t
ORDER BY id
This produces:
id, key, val, array_agg
1, A, 1, {1}
2, B, 1, {1}
3, A, 2, {1,2}
4, A, 3, {1,2,3}
5, A, 5, {1,2,3,5}
6, B, 8, {1,8}
Given the same table, what is the best way to do this in Python?
import pandas as pd
df = pd.DataFrame([
    (1, 'A', 1),
    (2, 'B', 1),
    (3, 'A', 2),
    (4, 'A', 3),
    (5, 'A', 5),
    (6, 'B', 8)
], columns=['id', 'key', 'val'])
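
For reference, here is a minimal sketch of one way this could be done, not necessarily the best one. It assumes the rows should be processed in `id` order and simply keeps a running list per key in plain Python; the names `running`, `aggregated` and `result` are mine and purely illustrative.

```python
from collections import defaultdict

import pandas as pd

df = pd.DataFrame([
    (1, 'A', 1),
    (2, 'B', 1),
    (3, 'A', 2),
    (4, 'A', 3),
    (5, 'A', 5),
    (6, 'B', 8)
], columns=['id', 'key', 'val'])

# Mirror ORDER BY id from the SQL version.
result = df.sort_values('id').reset_index(drop=True)

# One growing list per key (the PARTITION BY key part).
running = defaultdict(list)
aggregated = []
for key, val in zip(result['key'], result['val']):
    running[key].append(val)
    # Store a copy so later appends do not change earlier rows.
    aggregated.append(list(running[key]))

# Lists of different lengths end up in an object-dtype column.
result['array_agg'] = pd.Series(aggregated, index=result.index)
print(result)
```

This is a single pass over the rows, but every row stores its own copy of the list, so memory grows quadratically with the number of events per key. That seems unavoidable if each row really needs the full array.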