Consider this short piece of text:
@"
I want to match the word 'highlight' in a string. But I don't want to match
highlight when it is contained in an HTML anchor element. The expression
should not match highlight in the following text: <a href='#'>highlight</a>
"
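This is what the output should look like (matches shown in bold):
I want to match the word '**highlight**' in a string. But I don't want to match **highlight** when it is contained in an HTML anchor element. The expression should not match **highlight** in the following text: highlight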
How do I build an expression that matches every occurrence of X except those inside an HTML anchor element?
Answer 0 (score: 2)
I know you asked for a RegEx, but I wouldn't do it that way. Instead, here is a solution using the Html Agility Pack.
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static void Parse()
{
    string htmlFragment =
@"
I want to match the word 'highlight' in a string. But I don't want to match
highlight when it is contained in an HTML anchor element. The expression
should not match highlight in the following text: <a href='#'>highlight</a> more
";
    HtmlDocument htmlDocument = new HtmlDocument();
    htmlDocument.LoadHtml(htmlFragment);

    // SelectNodes returns null when nothing matches, so fall back to an empty sequence.
    IEnumerable<HtmlNode> textNodes =
        htmlDocument.DocumentNode.SelectNodes("//text()") ?? Enumerable.Empty<HtmlNode>();

    foreach (HtmlNode node in textNodes.Where(FilterTextNodes()))
    {
        Console.WriteLine(node.OuterHtml);
    }
}

// Keeps text nodes that contain "highlight" and whose direct parent is not an <a> element.
private static Func<HtmlNode, bool> FilterTextNodes()
{
    return node => node.NodeType == HtmlNodeType.Text
        && node.ParentNode != null
        && node.ParentNode.Name != "a"
        && node.OuterHtml.Contains("highlight");
}
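
If a regex-only approach is still required, one commonly used trick is to let the pattern consume whole anchor elements first and capture the word only in the other branch of the alternation. The following is a minimal sketch under the assumption that anchors are well formed and that no attribute value contains a '>' character; the class name AnchorAwareMatcher and the method FindOutsideAnchors are hypothetical names chosen for illustration. It uses only System.Text.RegularExpressions and does not replace a real HTML parser.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class AnchorAwareMatcher
{
    // Alternation: the first branch consumes an entire <a>...</a> element,
    // the second branch captures the target word in the named group "word".
    // Because the anchor branch is tried first, a word inside an anchor is
    // swallowed by that branch and never reaches the capturing group.
    private static readonly Regex Pattern = new Regex(
        @"<a\b[^>]*>.*?</a>|(?<word>\bhighlight\b)",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Returns the start index of every "highlight" found outside an anchor element.
    public static List<int> FindOutsideAnchors(string input)
    {
        var positions = new List<int>();
        foreach (Match match in Pattern.Matches(input))
        {
            Group word = match.Groups["word"];
            if (word.Success)
            {
                positions.Add(word.Index);
            }
        }
        return positions;
    }
}
```

Calling AnchorAwareMatcher.FindOutsideAnchors(htmlFragment) would return the positions of the matches to highlight. Matches where the "word" group did not succeed are anchor elements and are simply skipped. Malformed markup or unusual attribute values can defeat this pattern, which is the main reason to prefer the parser-based solution above.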