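I have a table, and I would like to read all the links in it and split each link's text into groups.
The goal is to capture the person and the status as separate groups,
so I have to get the value of each link in this table: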
<a class="darklink" href="testlink">Person 2, - Status of Person 2</a>
Is it possible to search only the table that comes right after a specific tag? Like this:
<p>title</p>
(because there are other similar tables on my site)
<p>title</p>
<table cellspacing="0" cellpadding="0" border="0" width="95%">
<tbody>
<tr>
<td bgcolor="#999999" colspan="2"><img height="1" border="0" width="1" src="images/dot_transp.gif" alt=" "/> </td>
</tr>
<tr>
<td><a class="darklink" href="asdfer">Person1, - Status of Person1 </a> </td>
<td valign="bottom"></td>
</tr>
<tr>
<td bgcolor="#999999" colspan="2"><img height="1" border="0" width="1" src="images/dot_transp.gif" alt=" "/> </td>
</tr>
<tr>
<td><a class="darklink" href="aeraseraesr">Person 2, - Status of Person 2</a></td>
<td valign="bottom"><a href="aeraeraer"> <img hspace="0" height="16" border="0" align="right" width="12" vspace="0" alt=" " src="images/ico_link.gif"/> </a> </td>
</tr>
<tr>
<td bgcolor="#999999" colspan="2"><img height="1" border="0" width="1" src="images/dot_transp.gif" alt=" "/> </td>
</tr>
<tr>
<td><a class="darklink" href="asdfasdf">Person 3. - Status of Person 3</a></td>
<td valign="bottom"><a href="aerere"> </a> </td>
</tr>
<tr> </tr>
</tbody>
</table>
Answer 0 (score: 2)
Your regular expression should be:
<a class="darklink" .*?>(.*?). - (.*?)</a>
Or, if your <a> tags contain line breaks:
<a class="darklink" [\s\S]*?>([\s\S]*?). - *([\s\S]*?)</a>
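To illustrate why the second pattern is needed, here is a minimal sketch. The sample string is made up for this example (it is not from the question), and it assumes the line break falls inside the status text. Note that both patterns still use a plain "." before " - ", so a line break directly in front of the dash would not match either of them.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical sample: the status text is split across two lines.
string html = "<a class=\"darklink\" href=\"x\">Person 2, - Status of\nPerson 2</a>";

// "." does not match "\n" by default, so the first pattern finds nothing here.
bool dotMatches = Regex.IsMatch(html, @"<a class=""darklink"" .*?>(.*?). - (.*?)</a>");

// [\s\S] matches any character, line breaks included.
Match m = Regex.Match(html, @"<a class=""darklink"" [\s\S]*?>([\s\S]*?). - *([\s\S]*?)</a>");

Console.WriteLine("First pattern matches: {0}", dotMatches);
Console.WriteLine("First group : {0}", m.Groups[1].Value);
Console.WriteLine("Second group: {0}", m.Groups[2].Value);
```

Passing RegexOptions.Singleline to the first pattern is another way to get the same effect, since that option makes "." match line breaks as well.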
So the following code should work:
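using System;
using System.Text.RegularExpressions;

// input is the HTML of the page as a string.
// The unescaped "." consumes the "," or "." that follows the name.
Regex person = new Regex(@"<a class=""darklink"" .*?>(.*?). - (.*?)</a>");
foreach (Match m in person.Matches(input))
{
    // Groups[1] is the person, Groups[2] is the status.
    Console.WriteLine("First group : {0}", m.Groups[1]);
    Console.WriteLine("Second group: {0}", m.Groups[2]);
}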
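As for searching only the table that follows <p>title</p>, one possible approach is to run two passes: first cut out that table, then apply the link pattern to it. The sketch below is only an illustration under a few assumptions: the sample input is invented, the marker is literally <p>title</p>, and the target table contains no nested tables (the lazy match stops at the first </table>).

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical page: two similar tables, only the second one follows <p>title</p>.
string input =
    "<table><tr><td><a class=\"darklink\" href=\"a\">Other, - Ignored</a></td></tr></table>" +
    "<p>title</p>\n" +
    "<table><tr><td><a class=\"darklink\" href=\"b\">Person 2, - Status of Person 2</a></td></tr></table>";

// Step 1: capture the first table that directly follows <p>title</p>.
Match table = Regex.Match(input, @"<p>title</p>\s*(<table[\s\S]*?</table>)");

// Step 2: run the link pattern on that table only.
// If step 1 found nothing, Groups[1].Value is empty and no links are returned.
Regex person = new Regex(@"<a class=""darklink"" .*?>(.*?). - (.*?)</a>");
MatchCollection links = person.Matches(table.Groups[1].Value);

foreach (Match m in links)
{
    Console.WriteLine("First group : {0}", m.Groups[1].Value);
    Console.WriteLine("Second group: {0}", m.Groups[2].Value);
}
```

If the markup is less regular than in the question (nested tables, attributes in a different order), an HTML parser is likely to be more robust than regular expressions.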