我有一个非常好的正则表达式,并且能够将字符串中的url替换为可单击一次。
string regex = @"((www\.|(http|https|ftp|news|file)+\:\/\/)[_.a-z0-9-]+\.[a-z0-9\/_:@=.+?,##%&~-]*[^.|\'|\# |!|\(|?|,| |>|<|;|\)])";
现在,我如何告诉它忽略已经可点击的链接和图像?
所以它忽略了下面的字符串:
<a href="http://www.someaddress.com">Some Text</a>
<img src="http://www.someaddress.com/someimage.jpg" />
示例:
The website www.google.com, once again <a href="http://www.google.com">www.google.com</a>, the logo <img src="http://www.google.com/images/logo.gif" />
结果:
The website <a href="http://www.google.com">www.google.com</a>, once again <a href="http://www.google.com">www.google.com</a>, the logo <img src="http://www.google.com/images/logo.gif" />
完整的HTML解析器代码:
string regex = @"((www\.|(http|https|ftp|news|file)+\:\/\/)[_.a-z0-9-]+\.[a-z0-9\/_:@=.+?,##%&~-]*[^.|\'|\# |!|\(|?|,| |>|<|;|\)])";
Regex r = new Regex(regex, RegexOptions.IgnoreCase);
text = r.Replace(text, "<a href=\"$1\" title=\"Click to open in a new window or tab\" target=\"_blank\" rel=\"nofollow\">$1</a>").Replace("href=\"www", "href=\"http://www");
return text;
答案 0 :(得分:2)
首先,如果没有其他人愿意,我会发布强制性链接。 RegEx match open tags except XHTML self-contained tags
如何使用"
这样的负前瞻/后退:
string regex = @"(?<!"")((www\.|(http|https|ftp|news|file)+\:\/\/)[_.a-z0-9-]+\.[a-z0-9\/_:@=.+?,##%&~-]*[^.|\'|\# |!|\(|?|,| |>|<|;|\)])(?!"")";
答案 1 :(得分:1)
退房:Detect email in text using regex,只需替换链接的正则表达式,它永远不会替换标记内的链接,只会在内容中。
http://htmlagilitypack.codeplex.com/
类似的东西:
string textToBeLinkified = "... your text here ...";
const string regex = @"((www\.|(http|https|ftp|news|file)+\:\/\/)[_.a-z0-9-]+\.[a-z0-9\/_:@=.+?,##%&~-]*[^.|\'|\# |!|\(|?|,| |>|<|;|\)])";
Regex urlExpression = new Regex(regex, RegexOptions.IgnoreCase | RegexOptions.ExplicitCapture);
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(textToBeLinkified);
var nodes = doc.DocumentNode.SelectNodes("//text()[not(ancestor::a)]") ?? new HtmlNodeCollection();
foreach (var node in nodes)
{
node.InnerHtml = urlExpression.Replace(node.InnerHtml, @"<a href=""$0"">$0</a>");
}
string linkifiedText = doc.DocumentNode.OuterHtml;