My DataFrame:
import pandas as pd

df = pd.DataFrame([["2012-01-06",1,"a",2],["2012-01-06",1,"b",3],["2012-01-06",1,"b",4],["2012-01-06",1,"b",5],["2012-02-06",2,"a",3],["2012-02-06",2,"a",4],["2012-02-06",3,"b",3],["2012-03-06",5,"b",3]],columns=["date","id","type","amount"])
# Move the three key columns into a MultiIndex
df = df.set_index(["date","id","type"])
df
This gives:
                    amount
date       id type
2012-01-06 1  a          2
              b          3
              b          4
              b          5
2012-02-06 2  a          3
              a          4
           3  b          3
2012-03-06 5  b          3
After grouping by the index:
gr = df.groupby(df.index).agg({'amount':sum})
gr
I get a result whose index is a flat index of tuples:
                    amount
(2012-01-06, 1, a)       2
(2012-01-06, 1, b)      12
(2012-02-06, 2, a)       7
(2012-02-06, 3, b)       3
(2012-03-06, 5, b)       3
I need gr to have the same structure as the original df, that is, a three-level MultiIndex:
                    amount
date       id type
2012-01-06 1  a          2
              b         12
2012-02-06 2  a          7
           3  b          3
2012-03-06 5  b          3
Answer 0 (score: 1)
Group by the levels of the DataFrame's index instead of by the index object itself:
df.groupby(level=[0,1,2]).amount.sum()
date        id  type
2012-01-06  1   a        2
                b       12
2012-02-06  2   a        7
            3   b        3
2012-03-06  5   b        3
Name: amount, dtype: int64
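Note that the call above returns a Series. As a minimal sketch of a variation (not part of the original answer), calling sum() on the grouped DataFrame and passing the level names rather than positions should keep amount as a column, which matches the structure asked for in the question:

```python
import pandas as pd

df = pd.DataFrame(
    [["2012-01-06", 1, "a", 2], ["2012-01-06", 1, "b", 3],
     ["2012-01-06", 1, "b", 4], ["2012-01-06", 1, "b", 5],
     ["2012-02-06", 2, "a", 3], ["2012-02-06", 2, "a", 4],
     ["2012-02-06", 3, "b", 3], ["2012-03-06", 5, "b", 3]],
    columns=["date", "id", "type", "amount"],
).set_index(["date", "id", "type"])

# Group by index level names; summing the grouped DataFrame (rather than
# selecting the 'amount' Series first) keeps 'amount' as a column.
result = df.groupby(level=["date", "id", "type"]).sum()
print(result)
```

Level names and level positions are interchangeable here; names tend to be easier to read when the index has several levels.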
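Answer 1 (score: 0)
Use Series.sum with the level parameter, which works the same way as groupby with the level parameter. (The level argument to sum was deprecated in pandas 1.3 and removed in pandas 2.0; on newer versions use groupby(level=...).sum() instead.)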
s = df.amount.sum(level=[0,1,2])
print(s)
date        id  type
2012-01-06  1   a        2
                b       12
2012-02-06  2   a        7
            3   b        3
2012-03-06  5   b        3
Name: amount, dtype: int64
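If gr has already been computed as in the question, another option is to rebuild the MultiIndex from its tuples. This is only a sketch and is not taken from either answer; it assumes gr.index holds one tuple per group, as shown in the question's output:

```python
import pandas as pd

df = pd.DataFrame(
    [["2012-01-06", 1, "a", 2], ["2012-01-06", 1, "b", 3],
     ["2012-01-06", 1, "b", 4], ["2012-01-06", 1, "b", 5],
     ["2012-02-06", 2, "a", 3], ["2012-02-06", 2, "a", 4],
     ["2012-02-06", 3, "b", 3], ["2012-03-06", 5, "b", 3]],
    columns=["date", "id", "type", "amount"],
).set_index(["date", "id", "type"])

# Same grouping as in the question: the group keys are whole index tuples
gr = df.groupby(df.index).agg({"amount": "sum"})

# Rebuild a three-level MultiIndex from the tuples, reusing the original level names
gr.index = pd.MultiIndex.from_tuples(gr.index.tolist(), names=df.index.names)
print(gr)
```

Grouping by level as in the answers above avoids this extra step, so the conversion is mainly useful when the tuple-indexed result comes from code you cannot change.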