I have the following df:
  country sport  score
0     ita  swim     15
1      fr   run     25
2     ger  golf     37
3     ita   run     17
4      fr  golf     58
5      fr   run     35
I'm only interested in certain values of each category:
ctr = ['ita', 'fr']
sprt = ['run', 'golf']
I was hoping to extract them like this:
df[(df['country']== x for x in ctr)&(df['sport']== x for x in sprt)]
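It doesn't raise any error, but it returns an empty result.
Any suggestions? I have also tried: df[(df['country']== {x for x in ctr})&(df['sport']== {x for x in sprt})]
Edit:
The reason I want to use a loop is that I'm actually interested in the 3 highest scores for each combination, and I was hoping to do something like this: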
df1 = pd.concat(df[(df['country']== x for x in ctr)&(df['sport']== x for x in sprt)].sort_values(by=['score'],ascending=False).head(3))
Answer 0: (score: 2)
Use Series.isin twice to test for membership:
df1 = df[(df['country'].isin(ctr))&(df['sport'].isin(sprt))]
print (df1)
  country sport  score
1      fr   run     25
3     ita   run     17
4      fr  golf     58
5      fr   run     35
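For comparison, here is a rough sketch of what the loop in the question seems to be aiming for: one equality test per wanted value, OR-ed together into a single boolean mask. A generator of comparisons is not a mask by itself, so the pieces have to be combined explicitly. The frame is rebuilt from the question's data, and the names country_mask, sport_mask, looped and vectorized are made up here for illustration only.

```python
from functools import reduce
import operator

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "country": ["ita", "fr", "ger", "ita", "fr", "fr"],
    "sport": ["swim", "run", "golf", "run", "golf", "run"],
    "score": [15, 25, 37, 17, 58, 35],
})
ctr = ["ita", "fr"]
sprt = ["run", "golf"]

# Explicit version: build one boolean Series per wanted value and OR them.
country_mask = reduce(operator.or_, (df["country"] == x for x in ctr))
sport_mask = reduce(operator.or_, (df["sport"] == x for x in sprt))
looped = df[country_mask & sport_mask]

# Vectorized version from the answer: isin does the membership test directly.
vectorized = df[df["country"].isin(ctr) & df["sport"].isin(sprt)]

print(looped.equals(vectorized))
```

On this data both selections pick the same rows, so isin is simply the shorter way to write it.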
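For the 3 highest scores per combination, sort by score first and then take the head of each group:
df2 = df1.sort_values('score', ascending=False).groupby(['country','sport']).head(3)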
print (df2)
  country sport  score
4      fr  golf     58
5      fr   run     35
1      fr   run     25
3     ita   run     17
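If you do want the explicit loop with pd.concat that the edit in the question describes, something along these lines should work. This is only a sketch: it assumes the same sample data, uses itertools.product to walk the country/sport combinations, and the names pieces, subset and top3 are hypothetical. Note that pd.concat needs a list of DataFrames, not a single filtered frame.

```python
from itertools import product

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "country": ["ita", "fr", "ger", "ita", "fr", "fr"],
    "sport": ["swim", "run", "golf", "run", "golf", "run"],
    "score": [15, 25, 37, 17, 58, 35],
})
ctr = ["ita", "fr"]
sprt = ["run", "golf"]

pieces = []
for c, s in product(ctr, sprt):
    # Rows for exactly one (country, sport) combination.
    subset = df[(df["country"] == c) & (df["sport"] == s)]
    if subset.empty:
        # Combinations with no rows (here: ita/golf) are skipped.
        continue
    # Keep the 3 highest scores of this combination.
    pieces.append(subset.sort_values("score", ascending=False).head(3))

top3 = pd.concat(pieces)
print(top3)
```

It keeps the same rows as the groupby version, only ordered by combination instead of by score. The groupby approach avoids the Python-level loop, so it is probably the better choice on larger frames.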