Group and rename pandas dataframe

Date: 2018-03-25 20:27:18

Tags: python pandas dataframe

In Python's pandas, I have a dataframe where one column holds a group identifier called "code" and another column holds notes for that group. Different rows of the same group may have different notes.
How can I rename each group to the note from its first occurrence?

Example:
In:

CODE   NOTE
A      Banana
B      Cola
A      Apple
B      Fanta
C      Toy

Out:

CODE     NOTE
Banana   Banana
Cola     Cola
Banana   Apple
Cola     Fanta
Toy      Toy

So far, I have this code to group and display code, count, and note:

df.groupby('CODE').NOTE.agg(['count', 'first']).sort_values('count', ascending=False)

1 answer:

Answer 0 (score: 2)

Call drop_duplicates to keep the first row of each CODE, then map every CODE to that row's NOTE:

df['CODE'] = df.CODE.map(df.drop_duplicates('CODE').set_index('CODE').NOTE)

Or,

df['CODE'] = df.CODE.replace(df.drop_duplicates('CODE').set_index('CODE').NOTE)

Alternatively,

mapper = df.drop_duplicates('CODE').set_index('CODE').NOTE.to_dict()
df['CODE'] = df['CODE'].map(mapper)

df

     CODE    NOTE
0  Banana  Banana
1    Cola    Cola
2  Banana   Apple
3    Cola   Fanta
4     Toy     Toy
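For anyone who wants to run this end to end, here is a minimal self-contained sketch that rebuilds the sample frame from the question and applies the approaches above. The variable names (first_note, via_map, and so on) are only illustrative. The last line uses groupby(...).transform('first'), which is not part of the answer above; it is an alternative that should give the same result on this data, with the caveat that the groupby 'first' aggregation skips missing values while drop_duplicates keeps the first row regardless.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "CODE": ["A", "B", "A", "B", "C"],
    "NOTE": ["Banana", "Cola", "Apple", "Fanta", "Toy"],
})

# First NOTE seen for each CODE, as a Series indexed by CODE
first_note = df.drop_duplicates("CODE").set_index("CODE")["NOTE"]

# The two approaches from the answer
via_map = df["CODE"].map(first_note)
via_replace = df["CODE"].replace(first_note)

# Alternative (an assumption, not from the answer): broadcast the
# first non-null NOTE of each group back onto every row of that group
via_transform = df.groupby("CODE")["NOTE"].transform("first")

# Assign whichever result you prefer back to the CODE column
df["CODE"] = via_map
print(df)
```

All three results hold the same values for this sample, so any of them can be assigned back to df['CODE'].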

Note: map is orders of magnitude faster than replace. Both give the same result here, because every CODE has an entry in the mapping.
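If you want to check the speed difference on your own machine, a rough timing sketch follows. The frame size (20,000 rows, built by repeating the sample data) and the repeat count are arbitrary assumptions, and the actual numbers will depend on your pandas version and data, so no expected timings are shown.

```python
import timeit

import pandas as pd

# Repeat the sample data to get a larger frame (size is arbitrary)
big = pd.DataFrame({
    "CODE": ["A", "B", "A", "B", "C"] * 4000,
    "NOTE": ["Banana", "Cola", "Apple", "Fanta", "Toy"] * 4000,
})

# Same mapper as in the answer: CODE -> first NOTE seen for that CODE
mapper = big.drop_duplicates("CODE").set_index("CODE")["NOTE"].to_dict()

# Time each approach over a few runs
t_map = timeit.timeit(lambda: big["CODE"].map(mapper), number=5)
t_replace = timeit.timeit(lambda: big["CODE"].replace(mapper), number=5)
print(f"map: {t_map:.4f}s  replace: {t_replace:.4f}s")

# Both approaches should agree when every CODE is a key in the mapper
same = big["CODE"].map(mapper).tolist() == big["CODE"].replace(mapper).tolist()
print("same result:", same)
```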