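I need a regex pattern that matches the first occurrence of a word that is not enclosed in an 'a' tag but may be enclosed in any other tag.
That is, a negative lookahead that checks whether the matched word is inside an 'a' tag and, if it is, ignores it and keeps looking for a valid match.
Sample strings
Payload 1: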
<p>Sample 1 <a href="shouldNotMatchWrappedInA">wordToMatch</a> some random text
to not be matched followed by wordToMatch, this should work.</p>
Expected result 1:
wordToMatch ("Not the one inside of the 'a' tags but the following one")
Payload 2:
<p>Sample 2 <a href="shouldNotMatchWrappedInA">wordToMatch</a> some random text
to not be matched followed by <b>wordToMatch</b> this should work.</p>
Expected result 2:
wordToMatch ("The one inside of the 'b' tags")
Payload 3:
<p>Sample 3 <a href="shouldNotMatchWrappedInA">wordToMatch</a> some
random text to not be matched followed by wordToMatch followed by
further occurrences of wordToMatch which should not be matched.</p>
Expected result 3:
wordToMatch ("The second occurrence of the term")
Please help :'(
The language used is Java.
Answer 0 (score: 0)
The simplest pattern I can think of is:
(?:<a.*>)(\w+)(?:<\/a>)
To test it, run this Perl script:
my $result = "<p>Sample 1 <a href=\"shouldNotMatchWrappedInA\">wordToMatch</a> some random text to not be matched followed by <b>wordToMatch</b>, this should work.</p>";
# Group 1 is the word inside <a>...</a>; group 2 is a later occurrence of the same word.
$result =~ m/(?:<a.*>)(\w+)(?:<\/a>).*(\1).*/;
print $2;
Note that you need to use the second captured group. Unfortunately, I can't give you an answer in Java.
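Since the question asks for Java, here is a minimal sketch of the negative-lookahead idea the question describes. It is not a port of the Perl pattern above. The class and method names (`FirstWordOutsideAnchor`, `firstIndexOutsideAnchor`) are made up for illustration, and the sketch assumes that the link text contains no nested tags and that the closing tag is written exactly as `</a>`. For arbitrary HTML, a real parser is likely a safer choice than a regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FirstWordOutsideAnchor {

    // Hypothetical helper: returns the start index of the first occurrence of
    // `word` that is not directly followed (with no '<' in between) by "</a>",
    // or -1 if there is no such occurrence.
    static int firstIndexOutsideAnchor(String html, String word) {
        // Pattern.quote keeps regex metacharacters in `word` literal.
        // (?![^<]*</a>) rejects a match when the next tag after it is </a>,
        // which is taken here to mean "the word is inside an <a> element".
        Pattern p = Pattern.compile("\\b" + Pattern.quote(word) + "\\b(?![^<]*</a>)");
        Matcher m = p.matcher(html);
        return m.find() ? m.start() : -1;
    }

    public static void main(String[] args) {
        String payload = "<p>Sample 1 <a href=\"shouldNotMatchWrappedInA\">wordToMatch</a> "
                + "some random text to not be matched followed by wordToMatch, this should work.</p>";
        int index = firstIndexOutsideAnchor(payload, "wordToMatch");
        // Print the index and the text starting at the match so the hit can be inspected.
        System.out.println(index + ": " + (index >= 0 ? payload.substring(index) : "no match"));
    }
}
```

Because `Matcher.find()` stops at the first successful match, later occurrences (as in payload 3) are never reported. A word wrapped in another tag such as `<b>` still matches, because the next tag after it is `</b>` rather than `</a>`.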