Question

Suppose we have this DataFrame:

import pandas as pd
d = pd.DataFrame({'year': [2010, 2020, 2010], 'colors': ['red', 'white', 'blue'], "shirt" : ["red shirt", "green and red shirt", "yellow shirt"] })

It looks like this:

    year    colors  shirt
0   2010    red     red shirt
1   2020    white   green and red shirt
2   2010    blue    yellow shirt

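I want to keep only the rows where the string in the shirt column contains the value in the colors column of the same row, while also filtering on the year column.

Desired output: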

    year    colors  shirt
0   2010    red     red shirt

I tried d[(d.year == 2010) & (d.shirt.str.contains(d.colors))], but got this error:

'Series' objects are mutable, thus they cannot be hashed

The DataFrame I am actually working with is large. How can I solve this with pandas functions?

Answer 1

I believe you need df.apply with axis=1, so that the colors and shirt values are compared row by row.

For example:

import pandas as pd
df = pd.DataFrame({'year': [2010, 2020, 2010], 'colors': ['red', 'white', 'blue'], "shirt" : ["red shirt", "green and red shirt", "yellow shirt"] })
# Keep rows from 2010 whose shirt text contains that row's color
print(df[(df.year == 2010) & df.apply(lambda x: x.colors in x.shirt, axis=1)])

Output:

   year colors      shirt
0  2010    red  red shirt
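
The original attempt fails because str.contains takes a single pattern, not a Series of per-row patterns. Since the question mentions a large DataFrame, it may be worth trying a plain list comprehension over the two columns instead of apply with axis=1, which builds a Series object for every row. The following is only a sketch of that alternative, not a benchmarked recommendation. The names color_in_shirt and result are illustrative and do not come from the question.

```python
import pandas as pd

df = pd.DataFrame({
    'year': [2010, 2020, 2010],
    'colors': ['red', 'white', 'blue'],
    'shirt': ['red shirt', 'green and red shirt', 'yellow shirt'],
})

# Row-wise substring test: is each row's color found in that row's shirt text?
color_in_shirt = pd.Series(
    [color in shirt for color, shirt in zip(df['colors'], df['shirt'])],
    index=df.index,  # align the mask with the DataFrame's own index
)

# Combine the substring mask with the year condition
result = df[(df['year'] == 2010) & color_in_shirt]
print(result)
```

Like the apply version, this does a plain substring test, so a color such as "red" would also match a shirt described as "reddish". It also assumes both columns hold strings with no missing values.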
