I want to display the first 2 results for the first 2 levels of a 3-level indexed DataFrame (via pivot_table)
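Question 1: How do I get only the first two profiles for each year/month combination?
Bonus question: How do I get the sum of the profiles outside the first 2 and call it "other"?
I would like the result returned as a DataFrame.
Answer 0 (score: 0)
UPDATE:
Without pivoting: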
In [120]: srt = df.sort_values(['year','month','profile'])
In [123]: srt[srt.groupby(['year','month'])['profile'].rank(method='min') <= 2]
Out[123]:
   year  month profile ranking  sales
0  2015      1       A      R1     70
6  2015      1       B      R2     50
4  2015      2       A      R1     30
1  2015      2       B      R2     40
5  2015      3       A      R3     20
8  2015      3       B      R3     10
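For anyone who wants to run this end to end, here is a minimal sketch. The `df` below is reconstructed from the pivoted output shown later in this answer, so its row order is an assumption. It also uses `cumcount()` instead of `rank()`, which is a swapped-in alternative that takes the first two rows of each sorted group without ranking the string column.

```python
import pandas as pd

# Reconstructed sample data (assumption: the original row order is unknown)
df = pd.DataFrame({
    "year":    [2015] * 10,
    "month":   [1, 1, 1, 1, 2, 2, 2, 3, 3, 3],
    "profile": ["A", "B", "C", "D", "A", "B", "C", "A", "B", "C"],
    "ranking": ["R1", "R2", "R3", "R2", "R1", "R2", "R1", "R3", "R3", "R3"],
    "sales":   [70, 50, 10, 90, 30, 40, 90, 20, 10, 20],
})

srt = df.sort_values(["year", "month", "profile"])

# cumcount() numbers the rows inside each (year, month) group starting at 0,
# so "< 2" keeps the first two profiles of every group
top2 = srt[srt.groupby(["year", "month"]).cumcount() < 2]
print(top2)
```

Note that "first two" here means the first two profile names in sorted order, the same as in the `rank()` version above, not the two profiles with the highest sales.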
Bonus answer (rank > 2 selects everything outside the first two profiles):
In [131]: srt[srt.groupby(['year','month'])['profile'] \
                 .rank(method='min') > 2] \
             .groupby(['year','month']).agg({'sales':'sum'})
Out[131]:
            sales
year month
2015 1        100
     2         90
     3         20
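The question also asks for the remainder to be labelled "other" and returned as a DataFrame. A minimal sketch of one way to do that, assuming the same reconstructed `df` as the data shown in this answer (the row order is an assumption) and using `cumcount()` as a stand-in for `rank()`:

```python
import pandas as pd

# Reconstructed sample data (assumption: the original row order is unknown)
df = pd.DataFrame({
    "year":    [2015] * 10,
    "month":   [1, 1, 1, 1, 2, 2, 2, 3, 3, 3],
    "profile": ["A", "B", "C", "D", "A", "B", "C", "A", "B", "C"],
    "ranking": ["R1", "R2", "R3", "R2", "R1", "R2", "R1", "R3", "R3", "R3"],
    "sales":   [70, 50, 10, 90, 30, 40, 90, 20, 10, 20],
})

srt = df.sort_values(["year", "month", "profile"])

# Position of each row inside its (year, month) group, starting at 0
pos = srt.groupby(["year", "month"]).cumcount()

# Keep the first two profile names, relabel every later profile as "other"
labelled = srt.assign(profile=srt["profile"].where(pos < 2, "other"))

# Summing per (year, month, profile) collapses all "other" rows into one
result = (labelled
          .groupby(["year", "month", "profile"], as_index=False, sort=False)["sales"]
          .sum())
print(result)
```

This gives three rows per month: the two kept profiles and a single "other" row holding the sum of the rest.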
With pivoting: you can try resetting the index after pivoting:
In [109]: pvt = df.pivot_table(values = 'sales',
.....: index = ['year','month','profile'],
.....: columns = ['ranking'],
.....: aggfunc = 'sum',
.....: fill_value = 0,
.....: margins = True).reset_index()
In [111]: pvt
Out[111]:
ranking  year month profile   R1   R2  R3  All
0        2015     1       A   70    0   0   70
1        2015     1       B    0   50   0   50
2        2015     1       C    0    0  10   10
3        2015     1       D    0   90   0   90
4        2015     2       A   30    0   0   30
5        2015     2       B    0   40   0   40
6        2015     2       C   90    0   0   90
7        2015     3       A    0    0  20   20
8        2015     3       B    0    0  10   10
9        2015     3       C    0    0  20   20
10        All                190  180  60  430
Now you can use the rank() method:
In [110]: pvt[pvt.sort_values(['year','month','profile']).groupby(['year','month'])['profile'].rank(method='min') <= 2]
Out[110]:
ranking  year month profile   R1   R2  R3  All
0        2015     1       A   70    0   0   70
1        2015     1       B    0   50   0   50
4        2015     2       A   30    0   0   30
5        2015     2       B    0   40   0   40
7        2015     3       A    0    0  20   20
8        2015     3       B    0    0  10   10
10        All                190  180  60  430
The ranks used for the filter:
In [112]: pvt.sort_values(['year','month','profile']).groupby(['year','month'])['profile'].rank(method='min')
Out[112]:
0 1
1 2
2 3
3 4
4 1
5 2
6 3
7 1
8 2
9 3
10 1
dtype: float64
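The ranks show why the margin row (index 10) survives the filter: it forms a group of its own, so it gets rank 1, and its totals still describe all profiles rather than the two that were kept. If that is not what you want, one option is to pivot without margins and compute the totals after filtering. This is only a sketch: `df` is reconstructed from the output above (the row order is an assumption), and `cumcount()` is used in place of `rank()`.

```python
import pandas as pd

# Reconstructed sample data (assumption: the original row order is unknown)
df = pd.DataFrame({
    "year":    [2015] * 10,
    "month":   [1, 1, 1, 1, 2, 2, 2, 3, 3, 3],
    "profile": ["A", "B", "C", "D", "A", "B", "C", "A", "B", "C"],
    "ranking": ["R1", "R2", "R3", "R2", "R1", "R2", "R1", "R3", "R3", "R3"],
    "sales":   [70, 50, 10, 90, 30, 40, 90, 20, 10, 20],
})

# Same pivot as above, but without margins, so no "All" row is mixed into the data
pvt = df.pivot_table(values="sales",
                     index=["year", "month", "profile"],
                     columns="ranking",
                     aggfunc="sum",
                     fill_value=0).reset_index()

# pivot_table returns the index sorted, so the first two rows of each
# (year, month) group are the first two profiles
top2 = pvt[pvt.groupby(["year", "month"]).cumcount() < 2].copy()

# Row totals computed only for the rows that were kept
top2["All"] = top2[["R1", "R2", "R3"]].sum(axis=1)
print(top2)
```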