I have this dataset:
sender team_id receiver
John Cena 1 Margaret
Genghis Khan 2 Mahathma
Mahathma Gandhi 1 John
John Doe 2 Genghis
Margaret Thatcher 1 John
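Each sender has a team ID, and the receiver column contains only a first name. I want to find out whether each message was sent between teammates. The result should look like this: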
sender team_id receiver btwn_teammates
John Cena 1 Margaret Yes
Genghis Khan 2 Mahathma No
Mahathma Gandhi 1 John Yes
John Doe 2 Genghis Yes
Margaret Thatcher 1 John Yes
Answer 0 (score: 2)
Merge on the first part of the name plus team_id, then map the indicator values:
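# Lookup of (first name, team_id) taken from the senders, renamed so the merge keys line up.
df2 = df[['sender', 'team_id']].rename(columns={'sender': 'receiver'})
df2['receiver'] = df2.receiver.str.split().str[0]
df2 = df2.drop_duplicates()  # So left merge preserves size.
# Merge on the shared columns (team_id, receiver); the indicator records whether a match exists.
df = df.merge(df2, how='left', indicator='btwn_team')
df['btwn_team'] = df.btwn_team.map({'both': 'Yes', 'left_only': 'No'})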
Output:
sender team_id receiver btwn_team
0 John Cena 1 Margaret Yes
1 Genghis Khan 2 Mahathma No
2 Mahathma Gandhi 1 John Yes
3 John Doe 2 Genghis Yes
4 Margaret Thatcher 1 John Yes
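To make the logic of the merge easier to follow, here is a minimal plain-Python sketch of the same check, using the rows from the question. It is an illustration rather than a replacement for the pandas code; the names `rows`, `known_members`, and `btwn_team` are chosen here for the example. It assumes, as the answer does, that a team's members are exactly the people who appear as senders, and that first names are enough to identify a teammate.

```python
# Rows from the question: (sender, team_id, receiver).
rows = [
    ("John Cena", 1, "Margaret"),
    ("Genghis Khan", 2, "Mahathma"),
    ("Mahathma Gandhi", 1, "John"),
    ("John Doe", 2, "Genghis"),
    ("Margaret Thatcher", 1, "John"),
]

# Every (first name, team_id) pair that occurs among the senders.
# This plays the role of df2 after drop_duplicates().
known_members = {(sender.split()[0], team_id) for sender, team_id, _ in rows}

# A message counts as between teammates when the receiver's first name
# appears on the sender's team (the 'both' case of the merge indicator).
btwn_team = [
    "Yes" if (receiver, team_id) in known_members else "No"
    for _, team_id, receiver in rows
]

for (sender, team_id, receiver), flag in zip(rows, btwn_team):
    print(sender, team_id, receiver, flag)
```

Because matching uses first names only, two people on the same team who share a first name cannot be told apart, and a receiver who never appears as a sender will always be marked "No". The same caveats apply to the merge-based version.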