Question

Okay, I'm trying to use Regex to write a method that extracts a list of URLs matching this pattern: @"http://(www\.)?([^\.]+)\.com". So far I have this:

public static List<string> Test(string url)
{
    const string pattern = @"http://(www\.)?([^\.]+)\.com";
    List<string> res = new List<string>();

    MatchCollection myMatches = Regex.Matches(url, pattern);

    foreach (Match currentMatch in myMatches)
    {

    }

    return res;
}

The main question is: which code should I use inside the foreach loop?

        res.Add(currentMatch.Groups.ToString());

or

            res.Add(currentMatch.Value);

Thanks!

Answer 1

You just need the Value of each Match. In your code, you should use

res.Add(currentMatch.Value);

Alternatively, just use LINQ:

res = Regex.Matches(url, pattern).Cast<Match>()
           .Select(p => p.Value)
           .ToList();
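
For completeness, here is a minimal sketch of the whole method with the loop body filled in. The class name UrlExtractor and the sample input string are my own assumptions for illustration; the pattern is the one from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class UrlExtractor
{
    // Same pattern as in the question.
    private const string Pattern = @"http://(www\.)?([^\.]+)\.com";

    public static List<string> Test(string url)
    {
        var res = new List<string>();

        // Each Match exposes the full matched text through its Value property.
        foreach (Match currentMatch in Regex.Matches(url, Pattern))
        {
            res.Add(currentMatch.Value);
        }

        return res;
    }

    public static void Main()
    {
        // Hypothetical input containing two URLs that fit the pattern.
        List<string> found = Test("see http://www.example.com and http://test.com today");

        foreach (string f in found)
        {
            Console.WriteLine(f);
        }
    }
}
```

With that input the list should contain http://www.example.com and http://test.com. The .Cast<Match>() call in the LINQ version is needed on older frameworks, where MatchCollection only implements the non-generic IEnumerable.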

Answer 2

res.Add(currentMatch.Groups.ToString()); would give you: System.Text.RegularExpressions.GroupCollection, so it looks like you didn't test it.
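
To illustrate the difference, here is a small sketch of what Groups is actually for: it is a collection indexed by capture group, so you index into it instead of calling ToString() on it. The class name GroupDemo, the helper DomainName and the sample URL are assumptions of mine, not something from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupDemo
{
    private const string Pattern = @"http://(www\.)?([^\.]+)\.com";

    // Hypothetical helper: returns only the name captured by the second group.
    public static string DomainName(string url)
    {
        Match m = Regex.Match(url, Pattern);

        // Groups[0] is the whole match, Groups[1] the optional "www.",
        // Groups[2] the part between the prefix and ".com".
        return m.Success ? m.Groups[2].Value : string.Empty;
    }

    public static void Main()
    {
        Match m = Regex.Match("http://www.example.com", Pattern);

        Console.WriteLine(m.Groups.ToString()); // the type name, not the matched text
        Console.WriteLine(m.Value);             // the full matched URL
        Console.WriteLine(m.Groups[2].Value);   // just the captured name
    }
}
```

So Value is what you want for the full URL, and Groups[n].Value only if you need one captured piece of it.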

How many matches do you expect to get from the url parameter?

I would use this:

static readonly Regex _domainMatcher =  new Regex(@"http://(www\.)?([^\.]+)\.com", RegexOptions.Compiled);

public static bool IsValidDomain(string url)
{
    return _domainMatcher.Match(url).Success;
}

or

public static string ExtractDomain(string url)
{ 
    var match = _domainMatcher.Match(url);
    if(match.Success)
        return match.Value;
    else
        return string.Empty;
}

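Since the parameter is named url, it should contain a single URL.

If the input can contain more than one, and you want to extract every domain that matches the pattern: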

public static IEnumerable<string> ExtractDomains(string data)
{
    var result = new List<string>();

    var match = _domainMatcher.Match(data);

    while (match.Success)
    {
        result.Add(match.Value);

        match = match.NextMatch();
    }

    return result;
}

Note the IEnumerable<> instead of List<>, since the caller doesn't need to modify the result.

Or this lazy variant:

public static IEnumerable<string> ExtractDomains(string data)
{
    var match = _domainMatcher.Match(data);

    while (match.Success)
    {
        yield return match.Value;
        match = match.NextMatch();
    }
}
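
A rough sketch of how the lazy variant behaves at the call site: because of yield return, nothing is matched until the caller enumerates, and enumeration can stop early. The class name LazyDemo and the sample data string are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class LazyDemo
{
    static readonly Regex _domainMatcher =
        new Regex(@"http://(www\.)?([^\.]+)\.com", RegexOptions.Compiled);

    public static IEnumerable<string> ExtractDomains(string data)
    {
        var match = _domainMatcher.Match(data);

        while (match.Success)
        {
            yield return match.Value;
            match = match.NextMatch();
        }
    }

    public static void Main()
    {
        // Hypothetical input with three URLs that fit the pattern.
        string data = "http://one.com http://www.two.com http://three.com";

        // Enumeration stops after the first yielded value,
        // so the remaining matches are never requested.
        string first = ExtractDomains(data).FirstOrDefault() ?? string.Empty;
        Console.WriteLine(first);

        // ToList() enumerates everything and materializes the results.
        List<string> all = ExtractDomains(data).ToList();
        Console.WriteLine(all.Count);
    }
}
```

If the caller always needs every match, the List-based version above is just as good; the lazy one pays off when the input is large and only the first few matches matter.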
