(pandas) Create a new column based on the first element in a groupby object

Time: 2017-03-05 20:56:20

Tags: pandas

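Say I have the following dataframe:

    Color Person
0    blue    bob
1   green    jim
2  orange    joe
3  yellow    bob
4    pink    jim
5  purple    joe

I want to create a new column that represents the first color each person saw:

    Color Person First Color
0    blue    bob        blue
1   green    jim       green
2  orange    joe      orange
3  yellow    bob        blue
4    pink    jim       green
5  purple    joe      orange

I have found a solution, but it seems very inefficient:

>>> df['First Color'] = None
>>> groups = df.groupby('Person')['Color']
>>> for person, colors in groups:
...    # colors holds this person's Color values in original row order
...    first_color = colors.iloc[0]
...    df.loc[df['Person'] == person, 'First Color'] = first_color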

Is there a faster way to do this all at once, without having to iterate over the groupby object?

2 answers:

Answer 0: (score: 6)

You need transform with first:

print (df.groupby('Person')['Color'].transform('first'))
0      blue
1     green
2    orange
3      blue
4     green
5    orange
Name: Color, dtype: object

df['First_Col'] = df.groupby('Person')['Color'].transform('first')
print (df)
    Color Person First_Col
0    blue    bob      blue
1   green    jim     green
2  orange    joe    orange
3  yellow    bob      blue
4    pink    jim     green
5  purple    joe    orange
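
For anyone who wants to paste and run this, here is a minimal self-contained sketch of the same approach. The frame is rebuilt from the data in the question, and the helper name add_first_color is hypothetical, introduced only for illustration.

```python
import pandas as pd


def add_first_color(df):
    """Return a copy of df with a 'First_Col' column (hypothetical helper)."""
    out = df.copy()
    # transform('first') takes each group's first non-null Color and
    # broadcasts it back to every row of that group, keeping the original index.
    out['First_Col'] = out.groupby('Person')['Color'].transform('first')
    return out


# Data copied from the question.
df = pd.DataFrame({
    'Color': ['blue', 'green', 'orange', 'yellow', 'pink', 'purple'],
    'Person': ['bob', 'jim', 'joe', 'bob', 'jim', 'joe'],
})

result = add_first_color(df)
print(result)
```

Because transform returns a Series aligned to the original index, it can be assigned directly as a column with no loop and no merge.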

Answer 1: (score: 4)

Use the transform() method:

In [177]: df['First_Col'] = df.groupby('Person')['Color'].transform('first')

In [178]: df
Out[178]:
    Color Person First_Col
0    blue    bob      blue
1   green    jim     green
2  orange    joe    orange
3  yellow    bob      blue
4    pink    jim     green
5  purple    joe    orange
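
One caveat that may matter on real data: the groupby 'first' aggregation skips missing values, so a person whose first row has a NaN color gets their first non-null color instead. If the literal first row is what you want, one possible alternative (a different technique from the answers above, shown on made-up data with NaN added) is to keep each person's first row and map it back:

```python
import numpy as np
import pandas as pd

# Made-up variant of the question's data: bob's first color is missing.
df = pd.DataFrame({
    'Color': [np.nan, 'green', 'yellow', 'pink'],
    'Person': ['bob', 'jim', 'bob', 'jim'],
})

# 'first' skips NaN, so bob is assigned his first non-null color.
skip_nan = df.groupby('Person')['Color'].transform('first')

# Keep only each person's first row, index it by Person,
# then look up that color for every row of the original frame.
first_rows = df.drop_duplicates('Person').set_index('Person')['Color']
literal_first = df['Person'].map(first_rows)

print(pd.DataFrame({'skip_nan': skip_nan, 'literal_first': literal_first}))
```

With the question's data, which has no missing values, both variants give the same column.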