Question

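I have text of the form

<span style="color:red;">hello</span> <span style="color:green;">world</span>

I want to match just one of the span elements, chosen by its hello or world text. I tried the following pattern: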

(<span.*?)(?=world).*?<\/span>

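It uses a lookahead, but it matches the entire string rather than only the span I am looking for. How can I non-greedily match the <span... that comes right before the world text?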

Answer 1

You can try the following regular expression:

(<span[^>]*>)world.*?<\/span>

Here is a Python snippet that uses this regex:

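import re

# Sample input: two span elements; only the second one contains "world".
text = '<span style="color:red;">hello</span> <span style="color:green;">world</span>'

# The pattern is padded with .* on both sides so that re.match can reach the second span.
match_obj = re.match(r'.*(<span[^>]*>)world.*?</span>.*', text, re.M | re.I)

if match_obj:
    print("match_obj.group() : ", match_obj.group())
    print("match_obj.group(1) : ", match_obj.group(1))
else:
    print("No match!!")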

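Note that in the Python code I added .* to the start and end of the original pattern. The leading .* is needed because re.match only matches at the beginning of the string; the trailing one is optional. Using re.search instead, which scans the whole string, avoids the padding entirely. Either way, hopefully this answer gets you unstuck and lets you carry on with your work.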
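As a rough sketch of the difference, the snippet below runs both the pattern from the question and the pattern from this answer through re.search, with no .* padding. The sample string is the one used above; the variable names (question_pattern, answer_pattern, too_wide, just_world) are only illustrative.

```python
import re

# Sample text, same as in the snippet above.
text = '<span style="color:red;">hello</span> <span style="color:green;">world</span>'

# Pattern from the question: the lazy .*? is still anchored at the first
# "<span", so it simply grows until the lookahead for "world" succeeds.
question_pattern = re.compile(r'(<span.*?)(?=world).*?</span>')

# Pattern from the answer: [^>]* cannot cross the closing ">" of the tag,
# so "world" has to follow the opening tag directly.
answer_pattern = re.compile(r'(<span[^>]*>)world.*?</span>')

too_wide = question_pattern.search(text)
just_world = answer_pattern.search(text)

print(too_wide.group())     # spans both elements
print(just_world.group())   # only the element containing "world"
print(just_world.group(1))  # only its opening tag
```

Being lazy does not move the start of a match: the regex engine always takes the leftmost position where a match is possible, which is why the question's pattern begins at the first span. Restricting what may appear between <span and > is what pins the match to the right element.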

Demo here:

Rextester