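Python regular expression
I have a string that contains keywords, but sometimes a keyword is missing, and the keywords are not in any particular order. I need help with a regular expression.
The keywords are: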
Up-to-date
date added
date trained
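These are the keywords I need to find among many other keywords. Any of them may be absent, and they can appear in any order.
What the string looks like: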
<div>
<h2 class='someClass'>text</h2>
blah blah blah Up-to-date blah date added blah
</div>
What I tried:
regex = re.compile('</h2>.*(Up\-to\-date|date\sadded|date\strained)*.*</div>')
regex = re.compile('</h2>.*(Up\-to\-date?)|(date\sadded?)|(date\strained?).*</div>')
re.findall(regex,string)
The result I am looking for would be:
If all exist
['Up-to-date','date added','date trained']
If only some exist
['Up-to-date','','date trained']
Answer 0 (score: 0)
Does it have to be a regular expression? If not, you can use str.find:
In [12]: sentence = 'hello world cat dog'
In [13]: words = ['cat', 'bear', 'dog']
In [15]: [w*(sentence.find(w)>=0) for w in words]
Out[15]: ['cat', '', 'dog']
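The same idea can be written with the `in` operator, which reads a little more directly than multiplying a string by a boolean. This is only a sketch: the names `find_keywords`, `KEYWORDS` and `sample` are made up for illustration, and the sample string is modelled on the one in the question.

```python
def find_keywords(text, keywords):
    """Return each keyword if it occurs in text, otherwise an empty string."""
    return [kw if kw in text else "" for kw in keywords]


# Hypothetical inputs, modelled on the string shown in the question.
KEYWORDS = ["Up-to-date", "date added", "date trained"]
sample = (
    "<div>\n"
    "<h2 class='someClass'>text</h2>\n"
    "blah blah blah Up-to-date blah date added blah\n"
    "</div>"
)

print(find_keywords(sample, KEYWORDS))
# ['Up-to-date', 'date added', '']
```

The result keeps the order of `KEYWORDS` and leaves an empty string in the slot of every keyword that is missing, which is the shape the question asks for. Note that this looks at the whole string, not only the part between `</h2>` and `</div>`.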
Answer 1 (score: 0)
This code does what you need, but it is a bit of a code smell:
import re

def check(the_str):
    """Return the keywords found between </h2> and </div>."""
    output_list = []
    # Raw strings keep the backslashes literal; DOTALL lets '.' span newlines.
    u2d = re.compile(r'</h2>.*Up-to-date.*</div>', re.DOTALL)
    da = re.compile(r'</h2>.*date\sadded.*</div>', re.DOTALL)
    dt = re.compile(r'</h2>.*date\strained.*</div>', re.DOTALL)
    # search() instead of match(), because the real string starts with <div>.
    if u2d.search(the_str):
        output_list.append("Up-to-date")
    if da.search(the_str):
        output_list.append("date added")
    if dt.search(the_str):
        output_list.append("date trained")
    return output_list

the_str = "</h2>My super cool string with the date added and then some more text</div>"
print(check(the_str))   # ['date added']
the_str2 = "</h2>My super cool string date added with the date trained and then some more text</div>"
print(check(the_str2))  # ['date added', 'date trained']
the_str3 = "</h2>My super cool string date added with the date trained and then Up-to-date some more text</div>"
print(check(the_str3))  # ['Up-to-date', 'date added', 'date trained']
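If the output should keep an empty string for each missing keyword, as in the question, one option is to cut out the text between `</h2>` and `</div>` first and then test each keyword separately. The following is a rough sketch, not a definitive solution: `BODY_RE`, `KEYWORDS` and `extract_keywords` are names invented here, and it assumes a single `<h2>...</h2>` followed by one closing `</div>`.

```python
import re

# Hypothetical names; the keyword list is taken from the question.
KEYWORDS = ["Up-to-date", "date added", "date trained"]

# Capture everything between </h2> and the next </div>, newlines included.
BODY_RE = re.compile(r"</h2>(.*?)</div>", re.DOTALL)


def extract_keywords(html, keywords=KEYWORDS):
    """Return each keyword found in the div body, or '' when it is absent."""
    match = BODY_RE.search(html)
    body = match.group(1) if match else ""
    found = []
    for kw in keywords:
        # Escape each word and allow any whitespace run between the words.
        pattern = r"\s+".join(re.escape(part) for part in kw.split())
        found.append(kw if re.search(pattern, body) else "")
    return found


html = (
    "<div>\n"
    "<h2 class='someClass'>text</h2>\n"
    "blah Up-to-date blah date\n trained blah\n"
    "</div>"
)
print(extract_keywords(html))
# ['Up-to-date', '', 'date trained']
```

Testing one keyword at a time avoids having to encode every possible ordering in a single pattern, and the result always follows the order of `KEYWORDS`. For anything beyond a simple fragment like this one, an HTML parser is usually a safer way to isolate the div text than a regular expression.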