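I am trying to build a table with subtotals as shown here, but either that code does not work with the latest pandas version (0.18.1), or the example is wrong for multiple columns instead of a single one. My code here produces the following table: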
                                                                    2014    2015    2016
project__name person__username activity__name    issue__subject
Influenster   employee1        Development                         161.0   122.0   104.0
                                                 Fix bug            22.0     0.0     0.0
                                                 Refactor view       0.0     7.0     0.0
                               Quality assurance                   172.0   158.0   161.0
              employee2        Development                         119.0   137.0   155.0
                               Quality assurance                   193.0   186.0   205.0
              employee3        Development       Refactor view       0.0     0.0     1.0
Profit tools  employee1        Development                         177.0   136.0   216.0
                               Quality assurance                   162.0   122.0   182.0
              employee2        Development                         154.0   168.0   124.0
                               Quality assurance                   130.0   183.0   192.0
                                                 Fix bug            22.0     0.0     0.0
All                                                               1312.0  1219.0  1340.0
The output I am hoping for looks something like this:
                                                                    2014    2015    2016
project__name person__username activity__name    issue__subject
Influenster   employee1        Development                         161.0   122.0   104.0
                                                 Fix bug            22.0     0.0     0.0
                                                 Refactor view       0.0     7.0     0.0
                                                 Total               xxx     xxx     xxx
                               Quality assurance                   172.0   158.0   161.0
                                                 Total               xxx     xxx     xxx
                               Total                                 xxx     xxx     xxx
              employee2        Development                         119.0   137.0   155.0
                                                 Total               xxx     xxx     xxx
                               Quality assurance                   193.0   186.0   205.0
                                                 Total               xxx     xxx     xxx
                               Total                                 xxx     xxx     xxx
              employee3        Development       Refactor view       0.0     0.0     1.0
                                                 Total               xxx     xxx     xxx
                               Total                                 xxx     xxx     xxx
              Total                                                  xxx     xxx     xxx
Profit tools  employee1        Development                         177.0   136.0   216.0
                                                 Total               xxx     xxx     xxx
                               Quality assurance                   162.0   122.0   182.0
                                                 Total               xxx     xxx     xxx
                               Total                                 xxx     xxx     xxx
              employee2        Development                         154.0   168.0   124.0
                                                 Total               xxx     xxx     xxx
                               Quality assurance                   130.0   183.0   192.0
                                                 Fix bug            22.0     0.0     0.0
                                                 Total               xxx     xxx     xxx
                               Total                                 xxx     xxx     xxx
              Total                                                  xxx     xxx     xxx
All                                                               1312.0  1219.0  1340.0
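Any help on how to achieve this is appreciated.
Answer 0 (score: 2)
Consider running three levels of pivot_tables, stacking each one, and concatenating the results into a final groupby object. As mentioned, the documented approach does work once you call .stack() on the column level of each pivot_table: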
import numpy as np
import pandas as pd

# df is the DataFrame from the question, with the yearly
# columns '2014', '2015' and '2016'.

# ISSUE_SUBJECT PIVOT (subtotals over issues within each activity)
pt1 = pd.pivot_table(data=df, values=['2014', '2015', '2016'],
                     columns=['issue__subject'], aggfunc=np.sum,
                     index=['project__name', 'person__username', 'activity__name'],
                     margins=True, margins_name='Total')
pt1 = pt1.stack().reset_index()

# ACTIVITY_NAME PIVOT (subtotals over activities within each person)
pt2 = pd.pivot_table(data=df, values=['2014', '2015', '2016'],
                     columns=['activity__name'], aggfunc=np.sum,
                     index=['project__name', 'person__username'],
                     margins=True, margins_name='Total')
pt2 = pt2.stack().reset_index()

# PERSON_USERNAME PIVOT (subtotals over persons within each project)
pt3 = pd.pivot_table(data=df, values=['2014', '2015', '2016'],
                     columns=['person__username'],
                     aggfunc=np.sum, index=['project__name'],
                     margins=True, margins_name='Total')
pt3 = pt3.stack().reset_index()

# CONCATENATE ALL THREE (only the 'Total' rows of the two higher-level pivots)
gdf = pd.concat([pt1,
                 pt2[(pt2['project__name'] == 'Total') |
                     (pt2['activity__name'] == 'Total')],
                 pt3[(pt3['project__name'] == 'Total') |
                     (pt3['person__username'] == 'Total')]]).reset_index(drop=True)

# REPLACE NaNs WITH EMPTY STRINGS
# (fillna('') does the same job as a row-wise np.where(pd.isnull(x), '', x))
gdf = gdf.fillna('')

# FINAL GROUPBY (A COUNT IS USED ONLY TO RENDER THE GROUPED INDEX)
gdf = gdf.groupby(['project__name', 'person__username',
                   'activity__name', 'issue__subject',
                   '2014', '2015', '2016']).size()
Output:
project__name person__username activity__name    issue__subject     2014    2015    2016
Influenster   Total                                                667.0   610.0   626.0    1
              employee1        Development                         161.0   122.0   104.0    1
                                                 Fix bug            22.0     0.0     0.0    1
                                                 Refactor view       0.0     7.0     0.0    1
                                                 Total             183.0   129.0   104.0    1
                               Quality assurance                   172.0   158.0   161.0    1
                                                 Total             172.0   158.0   161.0    1
                               Total                               355.0   287.0   265.0    1
              employee2        Development                         119.0   137.0   155.0    1
                                                 Total             119.0   137.0   155.0    1
                               Quality assurance                   193.0   186.0   205.0    1
                                                 Total             193.0   186.0   205.0    1
                               Total                               312.0   323.0   360.0    1
              employee3        Development       Refactor view       0.0     0.0     1.0    1
                                                 Total               0.0     0.0     1.0    1
                               Total                                 0.0     0.0     1.0    1
Profit tools  Total                                                645.0   609.0   714.0    1
              employee1        Development                         177.0   136.0   216.0    1
                                                 Total             177.0   136.0   216.0    1
                               Quality assurance                   162.0   122.0   182.0    1
                                                 Total             162.0   122.0   182.0    1
                               Total                               339.0   258.0   398.0    1
              employee2        Development                         154.0   168.0   124.0    1
                                                 Total             154.0   168.0   124.0    1
                               Quality assurance                   130.0   183.0   192.0    1
                                                 Fix bug            22.0     0.0     0.0    1
                                                 Total             152.0   183.0   192.0    1
                               Total                               306.0   351.0   316.0    1
Total                                                             1268.0  1212.0  1339.0    1
                                                 Fix bug            44.0     0.0     0.0    1
                                                 Refactor view       0.0     7.0     1.0    1
                                                 Total            1312.0  1219.0  1340.0    1
                               Development                         633.0   570.0   600.0    1
                               Quality assurance                   679.0   649.0   740.0    1
                               Total                              1312.0  1219.0  1340.0    1
              Total                                               1312.0  1219.0  1340.0    1
              employee1                                            694.0   545.0   663.0    1
              employee2                                            618.0   674.0   676.0    1
              employee3                                              0.0     0.0     1.0    1
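
To see the mechanism in isolation, here is a minimal sketch on made-up data. The `toy` frame, its column names, and the `stacked_pivot` helper are hypothetical and not part of the question. It uses a single value column and one fewer index level, so two pivots suffice instead of three, and it assumes every key combination is present so that no pivot cell is empty.

```python
import itertools

import pandas as pd

# Hypothetical toy data: each project/person/activity combination appears once.
combos = list(itertools.product(["A", "B"], ["p1", "p2"], ["dev", "qa"]))
toy = pd.DataFrame(combos, columns=["project", "person", "activity"])
toy["hours"] = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0, 128.0]


def stacked_pivot(frame, index, columns):
    """Pivot with margins, then move the pivoted column level into the rows."""
    pt = pd.pivot_table(frame, values="hours", index=index, columns=columns,
                        aggfunc="sum", margins=True, margins_name="Total")
    # stack() turns the margin column 'Total' into one subtotal row per index entry
    return pt.stack().reset_index(name="hours")


# Detail rows plus a 'Total' over activities for every project/person pair
detail = stacked_pivot(toy, ["project", "person"], "activity")

# One level up: keep only the subtotal rows, the detail is already in `detail`
by_person = stacked_pivot(toy, ["project"], "person")
subtotals = by_person[(by_person["project"] == "Total") |
                      (by_person["person"] == "Total")]

# Key columns missing from the higher-level pivot become NaN, so blank them out
combined = pd.concat([detail, subtotals], ignore_index=True).fillna("")
combined = combined.sort_values(["project", "person", "activity"]).reset_index(drop=True)

# Label-based access to any detail row or subtotal
lookup = combined.set_index(["project", "person", "activity"])["hours"]
print(combined)
```

Sorting on the key columns is what interleaves the `Total` rows with the detail rows. With this toy data, `lookup[("A", "p1", "Total")]` is the subtotal for person p1 in project A (1.0 + 2.0), and `lookup[("A", "Total", "")]` is the subtotal for the whole of project A.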