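What is the best way to reshape the following DataFrame so that each "status" value becomes its own column, and to add a column with the sum across statuses?
Before: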
plan type hour status total
A cont 0 ok 10
A cont 0 notok 3
A cont 0 other 1
A vend 1 ok 7
A vend 1 notok 2
A vend 1 other 0
B test 5 ok 20
B test 5 notok 6
B test 5 other 13
After:
plan type hour ok notok other sum
A cont 0 10 3 1 14
A vend 1 7 2 0 9
B test 5 20 6 13 39
Thanks in advance!
Answer 0 (score: 0)
You can use pivot_table:
In [9]: dff = df.pivot_table(index=['plan', 'type', 'hour'], columns='status',
values='total')
In [10]: dff['sum'] = dff.sum(axis=1)
In [11]: dff.reset_index()
Out[11]:
status plan type hour notok ok other sum
0 A cont 0 3 10 1 14
1 A vend 1 2 7 0 9
2 B test 5 6 20 13 39
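For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same pivot_table approach. The frame is rebuilt from the question's "Before" table; the variable name wide and the choice of aggfunc="sum" are assumptions made for this illustration (pivot_table aggregates with the mean by default, which gives the same numbers here because each cell holds a single value).

```python
import pandas as pd

# Rebuild the question's input frame (values copied from the "Before" table).
df = pd.DataFrame({
    "plan":   ["A"] * 6 + ["B"] * 3,
    "type":   ["cont"] * 3 + ["vend"] * 3 + ["test"] * 3,
    "hour":   [0] * 3 + [1] * 3 + [5] * 3,
    "status": ["ok", "notok", "other"] * 3,
    "total":  [10, 3, 1, 7, 2, 0, 20, 6, 13],
})

# aggfunc="sum" is an assumption for this sketch; the default is the mean.
wide = df.pivot_table(index=["plan", "type", "hour"], columns="status",
                      values="total", aggfunc="sum")

# Row-wise total over the status columns, computed before the index
# is turned back into columns so that "hour" is not included in the sum.
wide["sum"] = wide.sum(axis=1)
wide = wide.reset_index()
wide.columns.name = None  # drop the leftover "status" label on the columns

print(wide)
```

Adding the sum column before reset_index matters: once plan, type and hour are ordinary columns again, a row-wise sum would also pick up hour.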
Answer 1 (score: 0)
Use set_index + unstack to reshape, add the new column with assign, then call reset_index and rename_axis to clean up the result:
df = (df.set_index(['plan', 'type', 'hour', 'status'])['total']
        .unstack()
        .assign(sum=lambda x: x.sum(axis=1))
        .reset_index()
        # pass axis by keyword; recent pandas versions reject it positionally
        .rename_axis(None, axis=1))
print (df)
plan type hour notok ok other sum
0 A cont 0 3 10 1 14
1 A vend 1 2 7 0 9
2 B test 5 6 20 13 39
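If the combinations of plan, type, hour (together with status) are not unique, unstack cannot reshape the data. In that case, aggregate first with groupby and a function such as mean, or use the pivot_table approach from the other answer: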
print (df)
plan type hour status total
0 A cont 0 ok 10 <- duplicate 10 for plan, type, hour
1 A cont 0 ok 100 <- duplicate 100 for plan, type, hour
2 A cont 0 notok 3
3 A cont 0 other 1
4 A vend 1 ok 7
5 A vend 1 notok 2
6 A vend 1 other 0
7 B test 5 ok 20
8 B test 5 notok 6
9 B test 5 other 13
df = (df.groupby(['plan', 'type', 'hour', 'status'])['total'].mean()
        .unstack()
        .assign(sum=lambda x: x.sum(axis=1))
        .reset_index()
        # pass axis by keyword; recent pandas versions reject it positionally
        .rename_axis(None, axis=1))
print (df)
plan type hour notok ok other sum
0 A cont 0 3 55 1 59 <- 55 = (100 + 10) / 2
1 A vend 1 2 7 0 9
2 B test 5 6 20 13 39
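The duplicate case can be checked with a short sketch like the one below. It rebuilds the frame with the duplicated (A, cont, 0, ok) row, shows that a plain set_index + unstack raises on it, and compares two aggregations. The helper name widen and the use of sum as a second aggregation are assumptions for illustration only; which aggregation is appropriate depends on what the duplicates mean in your data.

```python
import pandas as pd

# The answer's frame, including the duplicated (A, cont, 0, ok) row.
df = pd.DataFrame({
    "plan":   ["A"] * 7 + ["B"] * 3,
    "type":   ["cont"] * 4 + ["vend"] * 3 + ["test"] * 3,
    "hour":   [0] * 4 + [1] * 3 + [5] * 3,
    "status": ["ok", "ok", "notok", "other"] + ["ok", "notok", "other"] * 2,
    "total":  [10, 100, 3, 1, 7, 2, 0, 20, 6, 13],
})

keys = ["plan", "type", "hour", "status"]

# Plain set_index + unstack cannot place two values in one cell.
try:
    df.set_index(keys)["total"].unstack()
    unstack_failed = False
except ValueError:
    unstack_failed = True


def widen(frame, how):
    """Hypothetical helper: aggregate duplicates with `how`, then reshape."""
    return (frame.groupby(keys)["total"].agg(how)
                 .unstack()
                 .assign(sum=lambda x: x.sum(axis=1))
                 .reset_index()
                 .rename_axis(None, axis=1))


by_mean = widen(df, "mean")  # duplicated cell becomes (10 + 100) / 2
by_sum = widen(df, "sum")    # duplicated cell becomes 10 + 100

print(unstack_failed)
print(by_mean)
print(by_sum)
```

Note that mean returns floating-point values, so the printed columns of by_mean may show a decimal point even where the values are whole numbers.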