Say I have the following DataFrame:
    Color Person First Color
0    blue    bob        blue
1   green    jim       green
2  orange    joe      orange
3  yellow    bob        blue
4    pink    jim       green
5  purple    joe      orange
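I want to create a new column, First Color, that holds the first color seen for each person (the last column in the table above). I have come up with a solution, but it seems really inefficient:
>>> df['First Color'] = None  # placeholder column, filled in below
>>> groups = df.groupby('Person')['Color']
>>> for person, colors in groups:
...     first_color = colors.iloc[0]
...     # a single .loc call avoids chained assignment, which may not write back to df
...     df.loc[df['Person'] == person, 'First Color'] = first_color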
Is there a faster way to do this all at once, without having to loop through the groupby object?
Answer 0 (score: 6)
print(df.groupby('Person')['Color'].transform('first'))
0      blue
1     green
2    orange
3      blue
4     green
5    orange
Name: Color, dtype: object
df['First_Col'] = df.groupby('Person')['Color'].transform('first')
print(df)
    Color Person First_Col
0    blue    bob      blue
1   green    jim     green
2  orange    joe    orange
3  yellow    bob      blue
4    pink    jim     green
5  purple    joe    orange
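For anyone who wants to reproduce this, here is a minimal self-contained sketch. The way the frame is constructed is an assumption (the question never shows how df was built). The column name First_Col follows this answer rather than the question's First Color.

```python
import pandas as pd

# Rebuild the example frame from the question (construction is assumed).
df = pd.DataFrame({
    "Color": ["blue", "green", "orange", "yellow", "pink", "purple"],
    "Person": ["bob", "jim", "joe", "bob", "jim", "joe"],
})

# transform('first') computes the first Color within each Person group and
# broadcasts it back to every row of that group, aligned on df's index.
df["First_Col"] = df.groupby("Person")["Color"].transform("first")

print(df)
```

No explicit Python loop over the groups is needed, which is what the question asked for.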
Answer 1 (score: 4)
Use the transform() method:
In [177]: df['First_Col'] = df.groupby('Person')['Color'].transform('first')

In [178]: df
Out[178]:
    Color Person First_Col
0    blue    bob      blue
1   green    jim     green
2  orange    joe    orange
3  yellow    bob      blue
4    pink    jim     green
5  purple    joe    orange
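To illustrate why transform() fits here, the sketch below contrasts it with the plain first() aggregation on the same data. The frame construction and the variable names (per_group, per_row, mapped) are hypothetical, chosen only for this illustration.

```python
import pandas as pd

# Example frame from the question (construction is assumed).
df = pd.DataFrame({
    "Color": ["blue", "green", "orange", "yellow", "pink", "purple"],
    "Person": ["bob", "jim", "joe", "bob", "jim", "joe"],
})

grouped = df.groupby("Person")["Color"]

# Aggregation: one value per group, indexed by Person.
per_group = grouped.first()

# Transformation: one value per original row, indexed like df,
# so it can be assigned straight back as a new column.
per_row = grouped.transform("first")

# Mapping the aggregated result back by hand gives the same values.
mapped = df["Person"].map(per_group)

print(per_group)
print(per_row)
```

Here per_group has three rows (one per person) while per_row has six, matching df. Note that the 'first' aggregation skips missing values by default, so with NaN in Color it returns the first non-null color per person.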