I have a dataframe with one column of strings and another column containing lists of strings.
   0                    1
0  apples are good      [orange, banana]
1  bananas are good     [bananas, bad]
2  cucumbers are green  [cucumbers, are]
3  grapes are green     [grapes, are, green]
4  oranges are good     [oranges]
5  pineapples are big   [flowers, apples]
I want to find every index where the string in Column 0 contains all of the items in that row's list in Column 1. In this case, the output would look like this:
   0                    1
2  cucumbers are green  [cucumbers, are]
3  grapes are green     [grapes, are, green]
4  oranges are good     [oranges]
I know I can use pandas.Series.str.contains, but that only works with a single list, and I would like to avoid iteration/looping if possible.
Answer 0 (score: 3)
You can use a list comprehension with Boolean indexing:
# Keep rows where every item in the Column 1 list is a whole word of the Column 0 string
res = df[[all(word in x.split() for word in y) for x, y in zip(df[0], df[1])]]
print(res)
   0                    1
2  cucumbers are green  [cucumbers, are]
3  grapes are green     [grapes, are, green]
4  oranges are good     [oranges]
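For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the example frame from the question (assuming the column labels are the integers 0 and 1, as the printed output suggests) and swaps the `all(...)` test for an equivalent set-based check; the names `mask`, `text`, and `words` are only illustrative.

```python
import pandas as pd

# Rebuild the example frame from the question; column labels are assumed
# to be the integers 0 and 1.
df = pd.DataFrame({
    0: ["apples are good", "bananas are good", "cucumbers are green",
        "grapes are green", "oranges are good", "pineapples are big"],
    1: [["orange", "banana"], ["bananas", "bad"], ["cucumbers", "are"],
        ["grapes", "are", "green"], ["oranges"], ["flowers", "apples"]],
})

# One boolean per row: True when every list item is a whole word of the string.
mask = [set(words).issubset(text.split()) for text, words in zip(df[0], df[1])]

# Boolean indexing keeps only the rows flagged True.
res = df[mask]
print(res.index.tolist())  # [2, 3, 4]
```

Splitting the string first makes this a whole-word match. A plain substring test (`word in text`) would also accept "apples" inside "pineapples are big", although row 5 would still be rejected here because "flowers" is missing.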