I'm aware of this link, but it didn't solve my problem.
I have the DataFrame below, produced by pandas.DataFrame.groupby().sum():
                    Value
Level Company Item
1     X       a       100
              b       200
      Y       a        35
              b       150
              c        35
2     X       a        48
              b       100
              c        50
      Y       a        80
and I would like to add a total row for each index level, so that I get:
                     Value
Level Company Item
1     X       a        100
              b        200
              Total    300
      Y       a         35
              b        150
              c         35
              Total    220
      Total            520
2     X       a         48
              b        100
              c         50
              Total    198
      Y       a         80
              Total     80
      Total            278
Total                  798
As requested, here is the code to reproduce the DataFrame:
import pandas as pd

level = list(map(int, list('111112222')))
company = list('XXYYYXXXY')
item = list('ababcabca')
value = [100,200,35,150,35,48,100,50,80]
col = ['Level', 'Company', 'Item', 'Value']
df = pd.DataFrame([level,company,item,value]).T
df.columns = col
df.groupby(['Level', 'Company', 'Item']).sum()
Answer 0 (score: 1)
You can use:
m=df.groupby(['Level','Company','Item'])['Value'].sum().unstack()
m.assign(total=m.sum(1)).stack().to_frame('Value')
                    Value
Level Company Item
1     X       a     100.0
              b     200.0
              total 300.0
      Y       a      35.0
              b     150.0
              c      35.0
              total 220.0
2     X       a      48.0
              b     100.0
              c      50.0
              total 198.0
      Y       a      80.0
              total  80.0
Answer 1 (score: 1)
Try this. Basically, it builds three DataFrames (the detail sums plus the totals for the two grouping levels) and concatenates them:
import pandas as pd

level = list(map(int, list('111112222')))
company = list('XXYYYXXXY')
item = list('ababcabca')
value = [100,200,35,150,35,48,100,50,80]
col = ['Level', 'Company', 'Item', 'Value']
df = pd.DataFrame([level,company,item,value]).T
df.columns = col
# Detail sums, indexed by (Level, Company, Item)
df1 = (df.groupby(['Level', 'Company', 'Item'])['Value'].sum())
# Totals per Level; Series.sum(level=...) was removed in pandas 2.0, so group by the index level instead
df2 = (df1.groupby(level=0).sum().to_frame().assign(Company='total').set_index('Company', append=True))
# Totals per (Level, Company)
df3 = (df1.groupby(['Level','Company']).sum().to_frame().assign(Item='total').set_index('Item', append=True))
dfx = pd.concat([df1.to_frame().reset_index(),
df2.reset_index(),
df3.reset_index()],sort=False)
print(dfx)
Output:
   Level Company   Item Value
0      1       X      a   100
1      1       X      b   200
2      1       Y      a    35
3      1       Y      b   150
4      1       Y      c    35
5      2       X      a    48
6      2       X      b   100
7      2       X      c    50
8      2       Y      a    80
0      1   total    NaN   520
1      2   total    NaN   278
0      1       X  total   300
1      1       Y  total   220
2      2       X  total   198
3      2       Y  total    80
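This is not sorted the way you wanted, though. If I concatenate the three DataFrames without resetting the index, I get the expected sort order, but the index then consists of tuples instead of separate columns: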
dfx = pd.concat([df1.to_frame(), df2, df3]).sort_index()
Output:
               Value
(1, X, a)        100
(1, X, b)        200
(1, X, total)    300
(1, Y, a)         35
(1, Y, b)        150
(1, Y, c)         35
(1, Y, total)    220
(1, total)       520
(2, X, a)         48
(2, X, b)        100
(2, X, c)         50
(2, X, total)    198
(2, Y, a)         80
(2, Y, total)     80
(2, total)       278
I'm not sure how to turn that index back into columns of the DataFrame.
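One possible way to get the columns back is to pad the shorter tuples so that every label has one entry per key, then build a regular MultiIndex from them. This is only a sketch: the labels and values are copied from the sorted output above, and it assumes that in practice they would come from something like list(dfx.index) and dfx["Value"].tolist(). The names labels, padded and flat are made up for the illustration.

```python
import pandas as pd

keys = ["Level", "Company", "Item"]

# Index labels and values copied from the sorted output above.
# Assumption: with the real frame they would come from
# list(dfx.index) and dfx["Value"].tolist().
labels = [
    (1, "X", "a"), (1, "X", "b"), (1, "X", "total"),
    (1, "Y", "a"), (1, "Y", "b"), (1, "Y", "c"), (1, "Y", "total"),
    (1, "total"),
    (2, "X", "a"), (2, "X", "b"), (2, "X", "c"), (2, "X", "total"),
    (2, "Y", "a"), (2, "Y", "total"),
    (2, "total"),
]
values = [100, 200, 300, 35, 150, 35, 220, 520, 48, 100, 50, 198, 80, 80, 278]

# Pad every tuple with empty strings so each one has an entry per key.
padded = [t + ("",) * (len(keys) - len(t)) for t in labels]

# A proper three-level MultiIndex can be reset into ordinary columns.
index = pd.MultiIndex.from_tuples(padded, names=keys)
flat = pd.DataFrame({"Value": values}, index=index).reset_index()
print(flat)
```

The row order is kept as given, so the totals stay directly below the rows they summarise; the level totals simply carry an empty string in the Item column.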
Answer 2 (score: 1)
You can try stacking it one level at a time:
m = df.groupby(['Level','Company','Item'])['Value'].sum().unstack(level=['Company','Item'])
m = m.assign(total=m.sum(1))
m = m.stack(level='Company')
m = m.assign(total=m.sum(1))
m = m.stack(level='Item')
The output contains duplicated totals, as follows:
Level  Company  Item
1      X        a        100.0
                b        200.0
                total    300.0
       Y        a         35.0
                b        150.0
                c         35.0
                total    220.0
       total             520.0
                total    520.0
2      X        a         48.0
                b        100.0
                c         50.0
                total    198.0
       Y        a         80.0
                total     80.0
       total             278.0
                total    278.0
dtype: float64
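None of the outputs shown in this thread includes the grand total row that the question asks for. The sketch below is one way to produce subtotals for every index level plus a grand total, without duplicated rows. It is not taken from any of the answers: with_subtotals is a hypothetical helper name, the "total" label is an assumption, and the DataFrame is rebuilt from the question's data with a dict so that Value keeps an integer dtype.

```python
import pandas as pd


def with_subtotals(df, keys, value, label="total"):
    """Detail sums plus one subtotal row per prefix of `keys`, grand total last."""
    n = len(keys)
    records = []  # pairs of (group values as a tuple, summed value)
    for depth in range(n, 0, -1):
        sums = df.groupby(keys[:depth])[value].sum()
        for group, total in sums.items():
            # A single grouping key yields scalar labels; normalise to tuples.
            group = group if isinstance(group, tuple) else (group,)
            records.append((group, total))
    records.append(((), df[value].sum()))  # grand total has no group values

    # Present keys sort as (0, value) and the end of a group as (1,),
    # so each subtotal lands right after the rows it summarises.
    records.sort(key=lambda rec: tuple((0, g) for g in rec[0]) + ((1,),))

    rows = []
    for group, total in records:
        # Subtotal rows get the label in the first unused key, blanks after it.
        padding = (label,) + ("",) * (n - len(group) - 1) if len(group) < n else ()
        rows.append(group + padding + (total,))
    return pd.DataFrame(rows, columns=keys + [value])


df = pd.DataFrame({
    "Level": [1, 1, 1, 1, 1, 2, 2, 2, 2],
    "Company": list("XXYYYXXXY"),
    "Item": list("ababcabca"),
    "Value": [100, 200, 35, 150, 35, 48, 100, 50, 80],
})

table = with_subtotals(df, ["Level", "Company", "Item"], "Value")
print(table.to_string(index=False))
```

The ordering is done with plain Python tuples instead of sort_index, so it does not depend on how the "total" label compares with the real key values. The result is a flat table; the Level column ends up with object dtype because the last row holds the label instead of a number.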