I have this DataFrame:
word, string1, string2
SQL, SQL is good, Programming
Java, Programming, Java is good
C#, Programming, Programming
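I have a column that gives a boolean telling me whether the value in the word column appears in the string1 column:
# v == v is False only for NaN, so the condition skips rows with missing values
data['res'] = data.apply(lambda x: x.word in x.string1
                         if (x.string1 == x.string1) and (x.word == x.word)
                         else False, axis=1)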
But I want to check whether the word value appears in either the string1 column or the string2 column. Something like this:
data['res'] = data.apply(lambda x: x.word in x.string1
                         if (x.string1 == x.string1) and (x.word == x.word)
                         else (x.word in x.string2
                               if (x.string2 == x.string2) and (x.word == x.word)
                               else False), axis=1)
What I want is:
word, string1, string2, res
SQL, SQL is good, Programming, True
Java, Programming, Java is good, True
C#, Programming, Programming, False
Is this possible?
Thanks!
Answer 0: (score: 1)
You need to check whether the string in the first column is present in any of the other columns, using any() over axis 1:
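# regex=False matches the word literally, so words such as "C++" are not parsed as regex patterns
df.apply(lambda x: x.str.contains(x.word, regex=False), axis=1).iloc[:, 1:].any(axis=1)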
0 True
1 True
2 False
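To make the one-liner easier to follow, here is a rough step-by-step sketch of the same idea. The DataFrame is rebuilt from the question's sample data, the intermediate names (mask, other_cols, res) are only illustrative, and regex=False is my addition so the word is matched literally.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "word": ["SQL", "Java", "C#"],
    "string1": ["SQL is good", "Programming", "Programming"],
    "string2": ["Programming", "Java is good", "Programming"],
})

# Step 1: for each row, test every cell for that row's own word.
# The result is a boolean DataFrame with the same columns as df.
mask = df.apply(lambda row: row.str.contains(row["word"], regex=False), axis=1)

# Step 2: drop the first column, because a word always contains itself.
other_cols = mask.iloc[:, 1:]

# Step 3: a row is True if the word was found in any remaining column.
res = other_cols.any(axis=1)
print(res)
```

The iloc[:, 1:] slice assumes word is the first column; if the column order differs, selecting the text columns by name would be safer.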
Full code:
df = df.assign(res=df.apply(lambda x: x.str.contains(x.word, regex=False), axis=1).iloc[:, 1:].any(axis=1))
word string1 string2 res
0 SQL SQL is good Programming True
1 Java Programming Java is good True
2 C# Programming Programming False
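If some cells may be NaN, a plain Python substring test can be a bit more forgiving than str.contains. The following is only a sketch under that assumption: word_in_columns is a hypothetical helper name, not a pandas function, and it treats any non-string cell as "no match".

```python
import pandas as pd


def word_in_columns(df, word_col, text_cols):
    """Return a boolean Series: True where the row's word occurs in any text column."""
    def row_match(row):
        word = row[word_col]
        # NaN/None are not str instances, so rows with a missing word give False.
        if not isinstance(word, str):
            return False
        # Literal substring test; non-string cells (e.g. NaN) are skipped.
        return any(isinstance(row[c], str) and word in row[c] for c in text_cols)

    return df.apply(row_match, axis=1)


# Sample data copied from the question
df = pd.DataFrame({
    "word": ["SQL", "Java", "C#"],
    "string1": ["SQL is good", "Programming", "Programming"],
    "string2": ["Programming", "Java is good", "Programming"],
})
df["res"] = word_in_columns(df, "word", ["string1", "string2"])
print(df)
```

Naming the text columns explicitly means the result does not depend on column order, at the cost of a slightly longer call.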
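Answer 1: (score: 1)
The simplest way is to concatenate the two columns and add another condition to the filter:
# v == v is False only for NaN, so all three values must be present
data['res'] = data.apply(lambda x: x.word in x.string1 + x.string2
                         if (x.string1 == x.string1) and
                            (x.string2 == x.string2) and
                            (x.word == x.word)
                         else False, axis=1)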
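One caveat with plain concatenation is that a word could match across the boundary between the two strings (for example "ab" against "a" and "b"). A possible variation, sketched below, joins the columns with a newline separator and replaces missing strings with an empty string first. The separator choice and the extra rows in the sample data are my own assumptions for illustration, and the sketch assumes the words themselves contain no newline.

```python
import pandas as pd

# Question's sample data plus two illustrative rows:
# one with a missing string2, one that would match across the boundary.
data = pd.DataFrame({
    "word": ["SQL", "Java", "C#", "ab"],
    "string1": ["SQL is good", "Programming", "Programming", "a"],
    "string2": ["Programming", "Java is good", None, "b"],
})

# Join the two text columns with a separator that is assumed not to occur in any word.
combined = data["string1"].fillna("") + "\n" + data["string2"].fillna("")

# Literal substring test per row; a missing word (non-string) gives False.
data["res"] = [isinstance(w, str) and w in s for w, s in zip(data["word"], combined)]
print(data)
```

Unlike the version above, this still checks string1 when string2 is missing, which may or may not be what you want.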