Parsing data from HTML with HtmlAgilityPack

Time: 2013-10-31 08:17:52

Tags: c# html parsing html-agility-pack

From the source below I want to extract the InnerText "My Name", but I can't output just that one href node; I get the whole list of href nodes instead:

<tr class="index">
  <td class="number">1.</td>
  <td class="image">
    <a href="/image/520211/" title="My index">
    <img src=" /images/M/MV5MDE.jpg" height="74" width="54" alt="My Alt" title="My Title">
    </a>
  </td>
  <td class="name">
    <span class="name_wrapper" data-size="small" data-caller-name="search">
    </span>
    <a href="/data/520211/">My Name</a>
    <span class="year">1974</span>
  </td>
</tr>

My code so far:

for (var index = 0; index < htmlDocument.DocumentNode.SelectNodes("//tr[@class='index']//a[@href]").Count; index++)
{
    var item = htmlDocument.DocumentNode.SelectNodes("//tr[@class='index']//a[@href]")[index];
    MessageBox.Show(item.InnerText);
}

1 answer:

Answer 0: (score: 0)

Try this:

string name = "";
var node = htmlDocument.DocumentNode
    .SelectSingleNode("//tr[@class='index']//td[@class='name']//a[@href]");
if (node != null)
    name = node.InnerText;
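
The reason the original loop shows more than one item is that `//tr[@class='index']//a[@href]` matches every anchor in the row, including the one that wraps the image; adding `td[@class='name']` narrows the match to the name cell. If the page has several such rows, one way to get one name per row is to select the rows first and then query relative to each row. The following is only a minimal sketch: the inline HTML is a trimmed copy of the row from the question, and the `Program` class and the `<table>` wrapper are assumptions added so the snippet runs on its own.

```csharp
using System;
using HtmlAgilityPack;

public class Program
{
    public static void Main()
    {
        // Trimmed copy of the row from the question (wrapped in a table for illustration).
        const string html = @"
<table>
<tr class=""index"">
  <td class=""image""><a href=""/image/520211/"" title=""My index""><img src=""/images/M/MV5MDE.jpg"" alt=""My Alt""></a></td>
  <td class=""name""><a href=""/data/520211/"">My Name</a><span class=""year"">1974</span></td>
</tr>
</table>";

        var htmlDocument = new HtmlDocument();
        htmlDocument.LoadHtml(html);

        // By default SelectNodes returns null, not an empty collection, when nothing matches.
        var rows = htmlDocument.DocumentNode.SelectNodes("//tr[@class='index']");
        if (rows == null)
            return;

        foreach (var row in rows)
        {
            // The leading "." keeps the query relative to the current row;
            // without it, "//" would search the whole document again.
            var link = row.SelectSingleNode(".//td[@class='name']//a[@href]");
            if (link != null)
                Console.WriteLine(link.InnerText.Trim());
        }
    }
}
```

Selecting the rows once also avoids re-running the XPath query on every loop iteration, which the original `for` loop does twice per pass.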