HtmlAgilityPack cannot find a specific td

Date: 2015-02-24 06:44:34

Tags: c# xpath html-agility-pack

I need to extract the value of a specific td from a table using XPath, but the lookup always returns null. How can I fix this?

var location = GetLocation(document.Result.DocumentNode.SelectSingleNode("//*[@id='detailTabTable']/tbody/tr[3]/td[2]"));

and the code:

private string GetLocation(HtmlNode h)
{
    try
    {
        string location = null;
        if (h == null)
        {
            location = "N/A";
        }
        else
        {
            location = h.InnerText;
            location = location.Substring(0, location.IndexOf(",", StringComparison.InvariantCulture));
        }
        return location;
    }
    catch (Exception ex)
    {
        log.ErrorFormat("Error in Link Data Repository {0} in Parse Links {1}", ex.Message, ex.StackTrace);
        throw new Exception(ex.Message);
    }
}

A small, simple table:

       <table id="detailTabTable" width="99%" border="0" cellspacing="0" cellpadding="0">
            <tr>
                <td class="detailTabContentLt">Current List Price:</td>
                <td class="detailTabContentPriceRt">
                  <span class="aiDetailCurrentPrice">AED 6,600,000</span>
                </td>
            </tr>
            <tr>
                <td class="detailTabContentLt" style="white-space: nowrap;">Plot size (Sq. Ft.):</td>
                <td class="detailTabContentRt">N/A</td>
            </tr>
            <tr>
                <td class="detailTabContentLt" valign="top">Locality</td>
                <td class="detailTabContentRt">Dubai, Dubai</td>
            </tr>
            <tr>
                <td colspan="2"></td>
            </tr>
        </table>

2 answers:

Answer 0 (score: 1)

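I just tested your code. As mentioned in the comments, everything works once you remove tbody from the XPath expression. The table's markup contains no tbody element, so a path that goes through /tbody/ has nothing to match. This works fine for me: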

private static void htmlAgilityPackTest()
{
    string html = " <table id=\"detailTabTable\" width=\"99%\" border=\"0\" cellspacing=\"0\" cellpadding=\"0\"><tr><td class=\"detailTabContentLt\">Current List Price:</td><td class=\"detailTabContentPriceRt\"><span class=\"aiDetailCurrentPrice\">AED 6,600,000</span></td> </tr><tr> <td class=\"detailTabContentLt\" style=\"white-space: nowrap;\">Plot size (Sq. Ft.):</td><td class=\"detailTabContentRt\">N/A</td></tr> <tr><td class=\"detailTabContentLt\" valign=\"top\">Locality</td> <td class=\"detailTabContentRt\">Dubai, Dubai</td> </tr> <tr><td colspan=\"2\"></td> </tr>  </table>";
    HtmlAgilityPack.HtmlDocument document = new HtmlAgilityPack.HtmlDocument();
    document.LoadHtml(html);

    var node = document.DocumentNode.SelectSingleNode("//*[@id='detailTabTable']/tr[3]/td[2]");
    string location = GetLocation(node);
    Console.WriteLine("Location: " + location);
}

Let me know if I misunderstood anything.
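
A possible way to make the lookup less brittle, offered only as a sketch: browsers add a tbody element to their DOM even when the markup has none, so an XPath copied from browser developer tools often contains /tbody/, while HtmlAgilityPack works on the markup as written. The example below avoids depending on either form by using `//` and by finding the row through its "Locality" label instead of its position. `TbodyXPathSketch`, `FindLocality`, and the trimmed-down table rows are hypothetical names and data introduced for illustration; they are not part of the original post.

```csharp
using System;
using HtmlAgilityPack;

public static class TbodyXPathSketch
{
    // Hypothetical helper (not from the original post): returns the text of the
    // cell that follows the "Locality" label, or null when no such cell exists.
    public static string FindLocality(string html)
    {
        var document = new HtmlDocument();
        document.LoadHtml(html);

        // "//" matches td at any depth below the table, so the query does not
        // care whether a <tbody> sits between <table> and <tr>. The row is
        // located by its label text rather than by tr[3].
        HtmlNode cell = document.DocumentNode.SelectSingleNode(
            "//*[@id='detailTabTable']//td[normalize-space()='Locality']/following-sibling::td[1]");

        return cell?.InnerText.Trim();
    }

    public static void Main()
    {
        // Trimmed-down copy of the rows from the question.
        const string rows =
            "<tr><td>Current List Price:</td><td>AED 6,600,000</td></tr>" +
            "<tr><td>Plot size (Sq. Ft.):</td><td>N/A</td></tr>" +
            "<tr><td>Locality</td><td>Dubai, Dubai</td></tr>";

        // Same rows, once without and once with an explicit <tbody>.
        string withoutTbody = "<table id=\"detailTabTable\">" + rows + "</table>";
        string withTbody = "<table id=\"detailTabTable\"><tbody>" + rows + "</tbody></table>";

        Console.WriteLine(FindLocality(withoutTbody));
        Console.WriteLine(FindLocality(withTbody));
    }
}
```

The returned string can then be passed through the comma-splitting logic from GetLocation. Matching on the label assumes the "Locality" text is stable on the target page; if it is not, the positional path without tbody shown above remains the simpler choice.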

Answer 1 (score: 0)