I have the following code to detect and replace links:
// need to find anchors
Regex urlRx = new Regex(@"((https?|ftp|file)\://|www.)[A-Za-z0-9\.\-]+(/[A-Za-z0-9\?\#\&\=;\+!'\(\)\*\-\._~%]*)*", RegexOptions.IgnoreCase);
MatchCollection matches = urlRx.Matches(source);
foreach (Match match in matches)
{
    source = source.Replace(match.Value, "<a target=\"_blank\" href='" + match.Value + "'>" + match.Value + "</a>");
}
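But this doesn't work well when source already contains an anchor, because it replaces the URL inside the existing anchor with another anchor. How can I prevent that from happening?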
Example I/O:
http://www.google.com -> <a target="_blank" href="http://www.google.com">http://www.google.com</a>
Pre-existing anchors (<a></a>) -> unchanged
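I think it would work to prevent matching any URL that is preceded by a non-whitespace character (or a quote, "), but I don't know how to do that.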
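For reference, the idea described above could be sketched with a negative lookbehind. This is only a rough illustration, not a tested solution: the class name LookbehindLinkifier and the method Linkify are made up for the example, and www. is escaped as www\. so that the dot is literal. A URL wrapped in parentheses or quotes in plain text would also be skipped, and a URL that follows a space inside an existing anchor's text would still be matched.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, named only for this sketch.
static class LookbehindLinkifier
{
    // (?<!\S) is a negative lookbehind: the URL must start at the beginning
    // of the input or directly after a whitespace character.
    private static readonly Regex BareUrl = new Regex(
        @"(?<!\S)((https?|ftp|file)\://|www\.)[A-Za-z0-9\.\-]+(/[A-Za-z0-9\?\#\&\=;\+!'\(\)\*\-\._~%]*)*",
        RegexOptions.IgnoreCase);

    public static string Linkify(string source)
    {
        // $0 in the replacement string stands for the whole match.
        return BareUrl.Replace(source, "<a target=\"_blank\" href='$0'>$0</a>");
    }
}
```

With this pattern, a URL that directly follows href=" or > is not matched, because the character before it is not whitespace.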
Answer 0 (score: 1)
You just need to check whether the URL is already part of a pre-existing anchor:
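using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

Regex urlRx = new Regex(@"((https?|ftp|file)\://|www.)[A-Za-z0-9\.\-]+(/[A-Za-z0-9\?\#\&\=;\+!'\(\)\*\-\._~%]*)*", RegexOptions.IgnoreCase);
MatchCollection matches = urlRx.Matches(source);
// The alternation is grouped so that both quote styles require the "<a ... href=" prefix.
var rxAnchor = new Regex("<a [^>]*href=(?:'(?<href>.*?)'|\"(?<href>.*?)\")", RegexOptions.IgnoreCase);
foreach (Match match in matches)
{
    // Re-scan on every pass so that anchors created by earlier replacements are seen too.
    List<string> urls = rxAnchor.Matches(source).OfType<Match>().Select(m => m.Groups["href"].Value).ToList();
    if (urls.Contains(match.Value))
    {
        // This URL is already the href of an anchor, so leave it alone.
        continue;
    }
    // Note: this compares URL strings, not positions, and Replace touches every occurrence.
    source = source.Replace(match.Value, "<a target=\"_blank\" href='" + match.Value + "'>" + match.Value + "</a>");
}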
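An alternative that avoids string-based Replace altogether is a single pass with Regex.Replace and a MatchEvaluator. The sketch below is an assumption-laden illustration, not part of the code above: the names Linkifier, Linkify, and AnchorOrUrl are made up, www. is escaped as www\., and an existing anchor is approximated as <a ...>...</a> with no nested anchors. The pattern tries the anchor alternative first, so any URL inside an existing anchor is consumed together with it and returned unchanged.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, named only for this sketch.
static class Linkifier
{
    // First alternative: a complete existing anchor (left untouched).
    // Second alternative: the URL pattern from the question.
    private static readonly Regex AnchorOrUrl = new Regex(
        @"(?<anchor><a\b[^>]*>.*?</a>)" +
        @"|(?<url>((https?|ftp|file)\://|www\.)[A-Za-z0-9\.\-]+(/[A-Za-z0-9\?\#\&\=;\+!'\(\)\*\-\._~%]*)*)",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    public static string Linkify(string source)
    {
        return AnchorOrUrl.Replace(source, m =>
            m.Groups["anchor"].Success
                ? m.Value  // existing anchor: keep as-is
                : "<a target=\"_blank\" href='" + m.Value + "'>" + m.Value + "</a>");
    }
}
```

Because each match is replaced at its own position, a URL that appears both inside an anchor and as bare text is handled correctly in both places. URLs in other attributes, such as an img src, would still be wrapped; for arbitrary HTML an HTML parser is likely a safer choice than regular expressions.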