I have the following DataFrame, which records two groups of animals and how much food each one eats per day:
import pandas as pd

df = pd.DataFrame({'animals': ['cat', 'cat', 'dog', 'dog', 'rat',
                               'cat', 'rat', 'rat', 'dog', 'cat'],
                   'food': [1, 2, 2, 5, 3, 1, 4, 0, 6, 5]},
                  index=pd.MultiIndex.from_product([['group1', 'group2'],
                                                    list(range(5))])
                  ).rename_axis(['groups', 'day'])
df
            animals  food
groups day
group1 0        cat     1
       1        cat     2
       2        dog     2
       3        dog     5
       4        rat     3
group2 0        cat     1
       1        rat     4
       2        rat     0
       3        dog     6
       4        cat     5
I can "map" / transform this into a new column, daily_meal, that shows how much food each animal should be given per day:
df['daily_meal'] = df.groupby(['animals', 'groups'])['food'].transform('mean')
df
            animals  food  daily_meal
groups day
group1 0        cat     1         1.5
       1        cat     2         1.5
       2        dog     2         3.5
       3        dog     5         3.5
       4        rat     3         3.0
group2 0        cat     1         3.0
       1        rat     4         2.0
       2        rat     0         2.0
       3        dog     6         6.0
       4        cat     5         3.0
I now want to know the rank of daily_meal within each group and "map" / transform it into a new column named group_rank. How can I do this?
For example:
            animals  food  daily_meal  group_rank
groups day
group1 0        cat     1         1.5           1
       1        cat     2         1.5           1
       2        dog     2         3.5           3
       3        dog     5         3.5           3
       4        rat     3         3.0           2
group2 0        cat     1         3.0           2
       1        rat     4         2.0           1
       2        rat     0         2.0           1
       3        dog     6         6.0           3
       4        cat     5         3.0           2
Answer 0 (score: 6)
Use transform to compute the per-animal mean, then rank within each group:
df['daily_meal'] = df.groupby(['animals', 'groups'])['food'].transform('mean')
df['group_rank'] = df.groupby('groups')['daily_meal'].rank(method='dense')
print(df)
            animals  food  daily_meal  group_rank
groups day
group1 0        cat     1         1.5         1.0
       1        cat     2         1.5         1.0
       2        dog     2         3.5         3.0
       3        dog     5         3.5         3.0
       4        rat     3         3.0         2.0
group2 0        cat     1         3.0         2.0
       1        rat     4         2.0         1.0
       2        rat     0         2.0         1.0
       3        dog     6         6.0         3.0
       4        cat     5         3.0         2.0
Or, without storing daily_meal as a column:
s = df.groupby(['animals', 'groups'])['food'].transform('mean')
df['group_rank'] = s.groupby('groups').transform(lambda x: x.rank(method='dense'))
print(df)
            animals  food  group_rank
groups day
group1 0        cat     1         1.0
       1        cat     2         1.0
       2        dog     2         3.0
       3        dog     5         3.0
       4        rat     3         2.0
group2 0        cat     1         2.0
       1        rat     4         1.0
       2        rat     0         1.0
       3        dog     6         3.0
       4        cat     5         2.0
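The choice of method='dense' matters here because several rows share the same daily_meal. As a minimal sketch (the Series `meals` and the frame `ranks` are illustrative names, holding the group1 values from the question), the following compares a few of the tie-handling methods that `Series.rank` offers:

```python
import pandas as pd

# Illustrative data: the daily_meal values of group1 from the question.
meals = pd.Series([1.5, 1.5, 3.5, 3.5, 3.0])

ranks = pd.DataFrame({
    'daily_meal': meals,
    'average': meals.rank(),              # default: tied rows share the mean of their positions
    'min': meals.rank(method='min'),      # tied rows share the lowest position; gaps follow
    'dense': meals.rank(method='dense'),  # tied rows share a rank; no gaps between ranks
})
print(ranks)
```

Only the 'dense' column yields the consecutive 1, 2, 3 ranks that the question asks for.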
Thanks to Scott Boston for the improved solution, which calls rank directly on the grouped data:
# Variant 1: keep daily_meal as a column
df['daily_meal'] = df.groupby(['animals', 'groups'])['food'].transform('mean')
df['group_rank'] = df.groupby('groups')['daily_meal'].rank(method='dense')
# Variant 2: use an intermediate Series instead of a daily_meal column
s = df.groupby(['animals', 'groups'])['food'].transform('mean')
df['group_rank'] = s.groupby('groups').rank(method='dense')
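Note that rank returns floats (1.0, 2.0, ...), whereas the desired output in the question shows integers. One possible way to close that gap, assuming no daily_meal value is missing (a NaN would make the cast fail), is to append astype(int). A self-contained sketch:

```python
import pandas as pd

# Rebuild the DataFrame from the question.
df = pd.DataFrame(
    {'animals': ['cat', 'cat', 'dog', 'dog', 'rat',
                 'cat', 'rat', 'rat', 'dog', 'cat'],
     'food': [1, 2, 2, 5, 3, 1, 4, 0, 6, 5]},
    index=pd.MultiIndex.from_product([['group1', 'group2'], range(5)],
                                     names=['groups', 'day']))

# Mean food per (animal, group), broadcast back to every row.
df['daily_meal'] = df.groupby(['animals', 'groups'])['food'].transform('mean')

# Dense rank within each group, cast to int (safe only when there are no NaNs).
df['group_rank'] = (df.groupby('groups')['daily_meal']
                      .rank(method='dense')
                      .astype(int))
print(df)
```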
Answer 1 (score: 3)
Use get_level_values + transform + rank:
df.groupby(df.index.get_level_values(level='groups'))['daily_meal'].transform(lambda x: x.rank(method='dense'))
Out[1068]:
groups  day
group1  0      1.0
        1      1.0
        2      3.0
        3      3.0
        4      2.0
group2  0      2.0
        1      1.0
        2      1.0
        3      3.0
        4      2.0
Name: daily_meal, dtype: float64
After assignment:
df['group_rank'] = df.groupby(df.index.get_level_values(level='groups'))['daily_meal'].transform(lambda x: x.rank(method='dense'))
df
Out[1070]:
            animals  food  daily_meal  group_rank
groups day
group1 0        cat     1         1.5         1.0
       1        cat     2         1.5         1.0
       2        dog     2         3.5         3.0
       3        dog     5         3.5         3.0
       4        rat     3         3.0         2.0
group2 0        cat     1         3.0         2.0
       1        rat     4         2.0         1.0
       2        rat     0         2.0         1.0
       3        dog     6         6.0         3.0
       4        cat     5         3.0         2.0
Here is how I obtained daily_meal:
df['daily_meal'] = df.groupby([df.index.get_level_values(level='groups'), df.animals])['food'].transform('mean')
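As a small cross-check, grouping by the values returned from get_level_values should give the same result as passing level='groups' to groupby, which is a slightly shorter spelling. The sketch below rebuilds the question's data and compares the two; `by_values` and `by_level` are illustrative names:

```python
import pandas as pd

# Rebuild the DataFrame from the question.
df = pd.DataFrame(
    {'animals': ['cat', 'cat', 'dog', 'dog', 'rat',
                 'cat', 'rat', 'rat', 'dog', 'cat'],
     'food': [1, 2, 2, 5, 3, 1, 4, 0, 6, 5]},
    index=pd.MultiIndex.from_product([['group1', 'group2'], range(5)],
                                     names=['groups', 'day']))

# Mean food per (group, animal), aligned to the original rows.
daily_meal = df.groupby(['groups', 'animals'])['food'].transform('mean')

# Group by the explicit level values, as in the answer above.
groups = daily_meal.index.get_level_values('groups')
by_values = daily_meal.groupby(groups).rank(method='dense')

# Group by the index level name directly.
by_level = daily_meal.groupby(level='groups').rank(method='dense')

print(by_level.equals(by_values))
```

Either spelling can be assigned straight to df['group_rank'], since the result keeps the original (groups, day) index.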