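I have a DataFrame of race results, and I'm trying to see whether the winner of each race comes from the same place the race was held.
The round_loc column: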
0 Val d'Allos, France
168 Les Deux Alpes, France
378 Winter Park, CO, USA
499 Whistler, BC, Canada
...
The country column:
0 France
168 France
378 France
499 Australia
602 France
...
My code:
winners_df = df.loc[df['finish_position'] == 1, ['country', 'round_loc']]
hometown_win = winners_df['country'].isin(winners_df['round_loc'])
# Also tried
hometown_win = winners_df['country'].isin(winners_df['round_loc'].values)
print(hometown_win)
My result:
0 False
168 False
378 False
499 False
602 False
...
I'm not sure what I'm doing wrong.
winners_df['country'][0] in winners_df['round_loc'][0]
works fine. I'm sure I could do this with a loop, but I feel like I'm missing something here.
Answer 0 (score: 1)
print (winners_df)
                  round_loc    country
0       Val d'Allos, France     France
168  Les Deux Alpes, France        USA  <-changed data sample
378    Winter Park, CO, USA     France
499    Whistler, BC, Canada  Australia
If you need to check whether each value in the round_loc column contains any one of the values from the country column:
a = '|'.join(winners_df['country'].unique().tolist())
print (a)
France|USA|Australia
hometown_win = winners_df['round_loc'].str.contains(a)
print(hometown_win)
0 True
168 True
378 True
499 False
Name: round_loc, dtype: bool
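One caveat with this approach: str.contains treats its pattern as a regular expression by default, so a country name containing regex metacharacters (a dot or parentheses, for example) could match more or less than intended. A minimal sketch, assuming the same sample data as above, that escapes each name with re.escape before joining; the variable names pattern and any_country_in_loc are only illustrative:

```python
import re

import pandas as pd

# Sample data rebuilt from the printed winners_df above (an assumption
# about how the real frame looks).
winners_df = pd.DataFrame(
    {
        "round_loc": [
            "Val d'Allos, France",
            "Les Deux Alpes, France",
            "Winter Park, CO, USA",
            "Whistler, BC, Canada",
        ],
        "country": ["France", "USA", "France", "Australia"],
    },
    index=[0, 168, 378, 499],
)

# Escape every country name so it is matched literally, then OR them together.
pattern = "|".join(re.escape(c) for c in winners_df["country"].unique())

# True where round_loc contains ANY country found anywhere in the country column.
any_country_in_loc = winners_df["round_loc"].str.contains(pattern, regex=True)
print(any_country_in_loc)
```

On this sample the result is the same as the unescaped version, since none of the names contain special characters.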
If you need to check whether round_loc contains the country value, but row by row:
hometown_win = winners_df.apply(lambda x: x['country'] in x['round_loc'],axis=1)
print(hometown_win)
0 True
168 False
378 False
499 False
dtype: bool
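As for why the original attempt returned all False: Series.isin tests whether each element is exactly equal to one of the supplied values. It does not do substring matching and does not compare row by row, so 'France' is never equal to a full string such as "Val d'Allos, France". The sketch below shows that behaviour next to a plain list comprehension over zip, which gives the same row-wise substring check as the apply call; it assumes the same sample data, and the names isin_result and hometown_win_zip are only illustrative:

```python
import pandas as pd

# Sample data rebuilt from the printed winners_df above (an assumption
# about how the real frame looks).
winners_df = pd.DataFrame(
    {
        "round_loc": [
            "Val d'Allos, France",
            "Les Deux Alpes, France",
            "Winter Park, CO, USA",
            "Whistler, BC, Canada",
        ],
        "country": ["France", "USA", "France", "Australia"],
    },
    index=[0, 168, 378, 499],
)

# isin compares whole values: no country equals a full round_loc string.
isin_result = winners_df["country"].isin(winners_df["round_loc"])
print(isin_result.tolist())

# Row-wise substring test: is this row's country inside this row's round_loc?
hometown_win_zip = pd.Series(
    [
        country in loc
        for country, loc in zip(winners_df["country"], winners_df["round_loc"])
    ],
    index=winners_df.index,
)
print(hometown_win_zip)
```

Both the comprehension and apply use Python's in operator, so a country name that happens to appear inside another word in round_loc would also count as a match.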