Suppose we have this df:
import pandas as pd
d = pd.DataFrame({'year': [2010, 2020, 2010], 'colors': ['red', 'white', 'blue'], "shirt" : ["red shirt", "green and red shirt", "yellow shirt"] })
which looks like this:
   year  colors                shirt
0  2010     red            red shirt
1  2020   white  green and red shirt
2  2010    blue         yellow shirt
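I want to keep only the rows where the "shirt" column contains that row's "colors" value, while also filtering on the "year" column.
Desired output: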
   year  colors      shirt
0  2010     red  red shirt
I tried this:
d[(d.year == 2010) & (d.shirt.str.contains(d.colors))]
but got this error:
'Series' objects are mutable, thus they cannot be hashed
The real df I am working with is large. How can I solve this with pandas functions?
Answer 0 (score: 2)
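I believe you need df.apply. str.contains expects a single pattern for the whole column, not a Series of per-row patterns, which is why passing d.colors fails. With axis=1 you can compare the two columns row by row instead.
For example: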
import pandas as pd
df = pd.DataFrame({'year': [2010, 2020, 2010], 'colors': ['red', 'white', 'blue'], "shirt" : ["red shirt", "green and red shirt", "yellow shirt"] })
# keep rows from 2010 whose shirt text contains that row's color
print(df[(df.year == 2010) & df.apply(lambda x: x.colors in x.shirt, axis=1)])
Output:
   year  colors      shirt
0  2010     red  red shirt
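Since the question mentions a large df, it may be worth trying a plain list comprehension over the two columns, which is often faster than apply with axis=1 (not benchmarked here). This is only a sketch: the name color_in_shirt is made up for illustration, and it assumes both columns hold plain strings with no missing values.

```python
import pandas as pd

df = pd.DataFrame({
    'year': [2010, 2020, 2010],
    'colors': ['red', 'white', 'blue'],
    'shirt': ['red shirt', 'green and red shirt', 'yellow shirt'],
})

# Row-wise substring test: True where the row's color occurs in the row's shirt text.
# Assumption: both columns contain only str values (a NaN would raise a TypeError).
color_in_shirt = [color in shirt for color, shirt in zip(df['colors'], df['shirt'])]

# Wrap the list in a Series that shares df's index so it lines up with the year mask.
result = df[(df['year'] == 2010) & pd.Series(color_in_shirt, index=df.index)]
print(result)
```

This should select the same single row as the apply version above. Like `in` inside the lambda, it is a plain substring test, so it is case-sensitive and does no regex matching.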