I want to check whether a substring exists in the "description" column and, if it does, write something to the "result" column. My code works when I have a single condition:
df.loc[df.index[df.description.str.contains('ab',flags=re.I, regex=True)],'result']='found ab'
and
df.loc[df.index[df.description.str.contains('d|f',flags=re.I, regex=True)],'result']='found d or f'
but not for an "and" condition:
df.loc[df.index[df.description.str.contains('d&f',flags=re.I, regex=True)],'result']='found d and f'
If I write it like this, it works, but it is too long:
df.loc[(df.index[df.description.str.contains('d',flags=re.I, regex=True)] & df.index[df.description.str.contains('f',flags=re.I, regex=True)]),'result']='found d&f'
Finally, is there a better way to write code for cases like this?
Answer 0 (score: 0)
To match an AND condition, you can use the following regex:
(?:d)\w*f|(?:f)\w*d
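Details:
(?:d) : non-capturing group, matches the character d literally
\w* : zero or more letters, digits, or underscores
f : matches the character f literally
| : or (that is, also look for an f that comes before a d)
(?:f) : non-capturing group, matches the character f literally
\w* : zero or more letters, digits, or underscores
d : matches the character d literally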
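To make the breakdown above concrete, here is a small sketch using only the standard `re` module. It compares the pattern from this answer with the `d&f` attempt from the question and with a lookahead-based variant. The lookahead pattern and the sample strings are my own assumptions, not part of the original answer. Note that because the answer's pattern joins the two letters with `\w*`, it only matches when d and f are connected by word characters.

```python
import re

# Pattern from the answer: d and f joined only by word characters, in either order.
AND_WORD = re.compile(r"(?:d)\w*f|(?:f)\w*d", flags=re.I)
# Assumed alternative (not from the answer): one lookahead per required letter.
AND_LOOKAHEAD = re.compile(r"(?=.*d)(?=.*f)", flags=re.I)
# "&" has no special meaning in a regex, so this searches for the literal text "d&f".
LITERAL_AMP = re.compile(r"d&f", flags=re.I)

# Illustrative inputs (assumed): the last two separate d and f with a non-word character.
samples = ["def", "fxd", "dxx", "d f", "d&f"]
for text in samples:
    print(
        repr(text),
        bool(AND_WORD.search(text)),
        bool(AND_LOOKAHEAD.search(text)),
        bool(LITERAL_AMP.search(text)),
    )
```

If descriptions may contain spaces or punctuation between the two letters, the lookahead form is probably the safer choice; for single-word values the two patterns should agree.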
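import re
import pandas as pd
df = pd.DataFrame(
    {"description": ["abc", "def", "hjk", "lmno", "dxx", "fxx", "fxd"]}
)
# (pattern, label) pairs; later entries overwrite labels set by earlier ones
reg_list = [
    ("ab", "found ab"),
    ("d|f", "found d OR f"),
    # raw string so that \w is not treated as a string escape
    (r"(?:d)\w*f|(?:f)\w*d", "found d AND f"),
    ("l|m|n|o", "found l|m|n|o"),
]
for pattern, label in reg_list:
    # case-insensitive regex match on the description column
    mask = df.description.str.contains(pattern, flags=re.I, regex=True)
    df.loc[mask, "result"] = label
print(df)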
Output:
  description         result
0         abc       found ab
1         def  found d AND f
2         hjk            NaN
3        lmno  found l|m|n|o
4         dxx   found d OR f
5         fxx   found d OR f
6         fxd  found d AND f
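As an alternative to encoding AND in a single regex, the per-substring masks can be combined directly as boolean Series and passed to `df.loc`, without going through `df.index`. As far as I know, recent pandas versions treat `&` between two Index objects as a logical operation rather than a set intersection, so combining the masks themselves is likely the more robust form of the long expression in the question. This is only a sketch: `contains_all` is a hypothetical helper name, and it does plain, case-insensitive substring matching instead of regex matching.

```python
from functools import reduce
from operator import and_

import pandas as pd

df = pd.DataFrame(
    {"description": ["abc", "def", "hjk", "lmno", "dxx", "fxx", "fxd"]}
)


def contains_all(series, substrings):
    """Hypothetical helper: True where every substring occurs (case-insensitive).

    Expects at least one substring; reduce() raises on an empty list.
    """
    masks = [
        series.str.contains(s, case=False, regex=False, na=False)
        for s in substrings
    ]
    # Element-wise AND of all boolean masks.
    return reduce(and_, masks)


df.loc[contains_all(df["description"], ["d", "f"]), "result"] = "found d AND f"
print(df)
```

With this approach, adding a third required substring only means extending the list, and the order or spacing of the letters inside the text does not matter.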