Finding strings with a regular expression

Time: 2015-05-05 23:51:20

Tags: c# regex

I have this text, and I am trying to print a1 and a2:

<a href="a1" title="t1"> k1 </a>
<a href="a2" title="t2"> k2 </a>

Here is my attempt:

string html =  "<a href=\"a1\" title=\"t1\"> k1 </a>";
       html += "<a href=\"a2\" title=\"t2\"> k2 </a>";

// Here is how I think my expression should work:
// <a href=" [something that is not a quote, 0 or more times] " [anything] </a>
Regex regex = new Regex("<a href=\"([^\"]*)\".*</a>");
foreach (Match match in regex.Matches(html))
    Console.WriteLine(match.Groups[1]);

Why does it print only a1? I am pretty sure I did it right. What am I missing?

1 answer:

Answer 0: (score: 2)

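The .* in your regex is greedy: it consumes every character up to the second (last) </a>, so the whole string becomes a single match and only the first href is captured. Use the lazy quantifier .*? instead, so that it consumes only as many characters as needed to reach the first </a>: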

Regex regex = new Regex("<a href=\"([^\"]*)\".*?</a>");
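To see the difference side by side, here is a minimal sketch that runs both patterns against the string from the question. The class name HrefDemo and the helper Capture are made up for this illustration and are not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class HrefDemo
{
    // Hypothetical helper: collects capture group 1 from every match of `pattern` in `input`.
    public static List<string> Capture(string pattern, string input)
    {
        var results = new List<string>();
        foreach (Match match in Regex.Matches(input, pattern))
            results.Add(match.Groups[1].Value);
        return results;
    }

    public static void Main()
    {
        // Same input as in the question: two anchors on one line, no newline between them.
        string html = "<a href=\"a1\" title=\"t1\"> k1 </a>"
                    + "<a href=\"a2\" title=\"t2\"> k2 </a>";

        // Greedy: .* runs to the last </a>, so both anchors collapse into one match.
        Console.WriteLine(string.Join(",", Capture("<a href=\"([^\"]*)\".*</a>", html)));   // a1

        // Lazy: .*? stops at the first </a>, so each anchor is its own match.
        Console.WriteLine(string.Join(",", Capture("<a href=\"([^\"]*)\".*?</a>", html)));  // a1,a2
    }
}
```

Note that the greedy version would also find both links if the two anchors were separated by a newline, since . does not match \n by default. The lazy quantifier does not depend on that.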

Also see: Why it's not possible to use regex to parse HTML/XML: a formal explanation in layman's terms