Question

I want to extract URLs using a regular expression (not every URL, only the ones that match my pattern).

I tried Regex.Match:

string html = request.Get(
    "http://www.bing.com/search?q=" + keyword + "&first=1"
).ToString();
Match urls = Regex.Match(html, "<h2><a href=\"(.*?)\"");

It returns only one URL, but I want all of them.

Edit: for anyone who runs into this problem, here is the solution:

string pattern = @"<a href=""([^""]+)";
Regex rgx = new Regex(pattern);

foreach (Match match in rgx.Matches(html))
    Console.WriteLine("Found '{0}' at position {1}", match.Value, match.Index);

Answer 1

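To get all the URLs, remove the <h2> tag from your pattern.

Try this pattern: <a href="([^"]+)

Explanation:

<a href=" matches the literal text <a href="

([^"]+) matches one or more characters other than " and stores them in the first capturing group

Regex.Match returns only the first match. To get all the URLs, call the Matches method, iterate over the results, and read the URL from each match's Groups property: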

foreach (Match match in Regex.Matches(html, "<a href=\"([^\"]+)"))
{
  // get the URL from the first capturing group
  string url = match.Groups[1].Value;
  // ...
}
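
For reference, here is a minimal self-contained sketch of the same approach. The UrlExtractor class, the ExtractUrls method, and the sample HTML string are made up for illustration; in the question, the html value would come from the downloaded search page instead. The pattern only matches anchors where href is the first attribute and is wrapped in double quotes.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class UrlExtractor
{
    // Hypothetical helper: collects the href value of every <a href="..."> in the input.
    public static List<string> ExtractUrls(string html)
    {
        var urls = new List<string>();

        // Regex.Matches returns every match, not just the first one.
        foreach (Match match in Regex.Matches(html, "<a href=\"([^\"]+)"))
        {
            // Groups[0] is the whole match; Groups[1] is the captured URL.
            urls.Add(match.Groups[1].Value);
        }

        return urls;
    }
}

public static class Program
{
    public static void Main()
    {
        // Made-up sample markup standing in for the downloaded page.
        string html = "<h2><a href=\"http://example.com/a\">A</a></h2>" +
                      "<p><a href=\"http://example.com/b\">B</a></p>";

        foreach (string url in UrlExtractor.ExtractUrls(html))
            Console.WriteLine(url);
    }
}
```

With the sample string above, this prints http://example.com/a and http://example.com/b, one per line. To keep only the links inside <h2> headings, as in the original attempt, put <h2> back at the start of the pattern and still call Matches rather than Match.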
