I have this DataFrame:
user_id event
1 registration
1 visit
1 purchase
2 registration
2 external
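I want to select all rows for every user_id that has at least one event of type external, so that I can store them in another DataFrame. For this case, I would expect a DataFrame containing: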
user_id event
2 registration
2 external
How can I do this?
Answer 0 (score: 4)
You can compute the in-scope user IDs first, then apply a Boolean mask:
# calculate unique in-scope IDs
ids = df.loc[df['event'] == 'external', 'user_id'].unique()
# mask dataframe with Boolean series
res = df[df['user_id'].isin(ids)]
print(res)
   user_id         event
3        2  registration
4        2      external
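For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame construction is an assumption reconstructed from the sample shown in the question; the two selection lines are the same as above.

```python
import pandas as pd

# Sample data reconstructed from the question (assumed construction)
df = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2],
    "event": ["registration", "visit", "purchase", "registration", "external"],
})

# IDs of users that have at least one 'external' event
ids = df.loc[df["event"] == "external", "user_id"].unique()

# Keep every row that belongs to one of those users
res = df[df["user_id"].isin(ids)]
print(res)
```

Note that the result keeps the original index labels (3 and 4); chain `.reset_index(drop=True)` if a fresh 0-based index is preferred.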
Answer 1 (score: 1)
You need:
# find the user IDs that have an `external` event
user = df[df['event'].isin(['external'])]['user_id']
# select the rows of df that belong to those users
df[df['user_id'].isin(user)]
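As a possible alternative that is not part of either answer, the same selection can be sketched in a single pass with a grouped `transform`. This builds a per-row Boolean flag that is true when the row's user has any `external` event. The sample DataFrame below is an assumption reconstructed from the question.

```python
import pandas as pd

# Sample data reconstructed from the question (assumed construction)
df = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2],
    "event": ["registration", "visit", "purchase", "registration", "external"],
})

# For each row: does this row's user have at least one 'external' event?
has_external = df["event"].eq("external").groupby(df["user_id"]).transform("any")

# Boolean mask selects all rows of the matching users
res = df[has_external]
print(res)
```

The `isin` approach is probably easier to read; the `transform` variant may be handy when the per-user flag is also needed as a column.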