Regular expression to remove some text from a hyperlink

Asked: 2011-09-17 15:35:03

Tags: c# regex

click <a href="javascript:validate('http://www.google.com');">here</a> to open google.com

I need to replace the sentence above with the following:

click <a href="http://www.google.com">here</a> to open google.com

Please help me do this in C# using a regular expression.

5 answers:

Answer 0 (score: 1)

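Regex regex = new Regex("href=\".+?'(.+?)'",
    RegexOptions.IgnoreCase);
MatchCollection matches = regex.Matches(text);

Then you need to extract group 1 from each match, for example the first one:

matches[0].Groups[1].Value

This is the new value to assign to the href attribute.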
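The extraction and the assignment can also be done in one step with Regex.Replace. The following is only a minimal sketch, not the answerer's code: the class name HrefUnwrapper, the method Unwrap, and the slightly stricter pattern (which requires the href to start with javascript:) are assumptions made for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, named here for illustration only.
static class HrefUnwrapper
{
    // Matches href="javascript:...('URL')..." and captures URL as group 1.
    static readonly Regex Wrapped = new Regex(
        "href=\"javascript:[^\"']*'([^\"']*)'[^\"]*\"",
        RegexOptions.IgnoreCase);

    public static string Unwrap(string html)
    {
        // $1 in the replacement string refers to capture group 1.
        return Wrapped.Replace(html, "href=\"$1\"");
    }
}

class Program
{
    static void Main()
    {
        string text = "click <a href=\"javascript:validate('http://www.google.com');\">here</a> to open google.com";
        Console.WriteLine(HrefUnwrapper.Unwrap(text));
    }
}
```

Because the whole href attribute is matched and rewritten, links whose href does not start with javascript: are left untouched.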

Answer 1 (score: 1)

Here you go:

Regex:

(?<=href\=")(javascript:validate\('(?<URL>[^"']*)'\);)

Code:

string url = "click <a href=\"javascript:validate('http://www.google.com');\">here</a> to open google.com";
Regex regex = new Regex("(?<=href\\=\")javascript:validate\\('(?<URL>[^\"']*)'\\);");
string output = regex.Replace(url, "${URL}");

Output:

click <a href="http://www.google.com">here</a> to open google.com

Answer 2 (score: 1)

No regex needed:

var s = 
    inputString.Replace(
        "javascript:validate('http://www.google.com');",
        "http://www.google.com" );
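The call above only handles that one hard-coded URL. If the wrapper text is always exactly javascript:validate('...'); the same regex-free idea can be extended to any URL with IndexOf and Substring. This is a rough sketch under that assumption; the class name PlainStringUnwrapper and the method Unwrap are hypothetical.

```csharp
using System;

// Hypothetical helper, named here for illustration only.
static class PlainStringUnwrapper
{
    const string Prefix = "javascript:validate('";
    const string Suffix = "');";

    public static string Unwrap(string input)
    {
        int start = input.IndexOf(Prefix, StringComparison.Ordinal);
        while (start >= 0)
        {
            int urlStart = start + Prefix.Length;
            int end = input.IndexOf(Suffix, urlStart, StringComparison.Ordinal);
            if (end < 0) break; // unterminated wrapper: leave the rest as-is

            string url = input.Substring(urlStart, end - urlStart);

            // Splice the bare URL in place of the whole wrapper.
            input = input.Substring(0, start) + url + input.Substring(end + Suffix.Length);

            // Continue searching after the URL that was just inserted.
            start = input.IndexOf(Prefix, start + url.Length, StringComparison.Ordinal);
        }
        return input;
    }
}

class Program
{
    static void Main()
    {
        string s = "click <a href=\"javascript:validate('http://www.google.com');\">here</a> to open google.com";
        Console.WriteLine(PlainStringUnwrapper.Unwrap(s));
    }
}
```

The comparison is ordinal and case-sensitive, so a differently cased or differently named wrapper function would not be matched.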

Answer 3 (score: 0)

Parsing the HTML, as Austin suggests, is the more effective approach, but if you absolutely must use a regex, try something like this (referenced from MSDN System.Text.RegularExpressions Namespace):

using System;
using System.Text.RegularExpressions;

class MyClass
{
    static void Main(string[] args)
    {
        string pattern = @"<a href=""[^\(]*\('([^']+)'\);"">";
        Regex r = new Regex(pattern, RegexOptions.IgnoreCase);
        string sInput = "click <a href=\"javascript:validate('http://www.google.com');\">here</a> to open google.com";

        MyClass c = new MyClass();

        // Assign the replace method to the MatchEvaluator delegate.
        MatchEvaluator myEvaluator = new MatchEvaluator(c.ReplaceCC);

        // Write out the original string.
        Console.WriteLine(sInput);

        // Replace matched characters using the delegate method.
        sInput = r.Replace(sInput, myEvaluator);

        // Write out the modified string.
        Console.WriteLine(sInput);
    }

    // Rebuild the opening <a> tag around the URL captured in group 1.
    public string ReplaceCC(Match m)
    {
        return "<a href=\"" + m.Groups[1].Value + "\">";
    }
}

Answer 4 (score: 0)

HtmlAgilityPack: http://htmlagilitypack.codeplex.com

This is the preferred way to parse HTML.