I need to extract the value of a specific td from a table using XPath, but my code always returns null. How can I fix this?
var location = GetLocation(document.Result.DocumentNode.SelectSingleNode("//*[@id='detailTabTable']/tbody/tr[3]/td[2]"));
And the code:
private string GetLocation(HtmlNode h)
{
    try
    {
        string location = null;
        if (h == null)
        {
            location = "N/A";
        }
        else
        {
            location = h.InnerText;
            location = location.Substring(0, location.IndexOf(",", StringComparison.InvariantCulture));
        }
        return location;
    }
    catch (Exception ex)
    {
        log.ErrorFormat("Error in Link Data Repository {0} in Parse Links {1}", ex.Message, ex.StackTrace);
        throw new Exception(ex.Message);
    }
}
A small, simple table:
<table id="detailTabTable" width="99%" border="0" cellspacing="0" cellpadding="0">
<tr>
<td class="detailTabContentLt">Current List Price:</td>
<td class="detailTabContentPriceRt">
<span class="aiDetailCurrentPrice">AED 6,600,000</span>
</td>
</tr>
<tr>
<td class="detailTabContentLt" style="white-space: nowrap;">Plot size (Sq. Ft.):</td>
<td class="detailTabContentRt">N/A</td>
</tr>
<tr>
<td class="detailTabContentLt" valign="top">Locality</td>
<td class="detailTabContentRt">Dubai, Dubai</td>
</tr>
<tr>
<td colspan="2"></td>
</tr>
</table>
Answer 0: (score: 1)
I just tested your code. As mentioned in the comments, everything works once you remove tbody from the XPath expression. This works fine for me:
private static void htmlAgilityPackTest()
{
    string html = " <table id=\"detailTabTable\" width=\"99%\" border=\"0\" cellspacing=\"0\" cellpadding=\"0\"><tr><td class=\"detailTabContentLt\">Current List Price:</td><td class=\"detailTabContentPriceRt\"><span class=\"aiDetailCurrentPrice\">AED 6,600,000</span></td> </tr><tr> <td class=\"detailTabContentLt\" style=\"white-space: nowrap;\">Plot size (Sq. Ft.):</td><td class=\"detailTabContentRt\">N/A</td></tr> <tr><td class=\"detailTabContentLt\" valign=\"top\">Locality</td> <td class=\"detailTabContentRt\">Dubai, Dubai</td> </tr> <tr><td colspan=\"2\"></td> </tr> </table>";
    HtmlAgilityPack.HtmlDocument document = new HtmlAgilityPack.HtmlDocument();
    document.LoadHtml(html);
    var node = document.DocumentNode.SelectSingleNode("//*[@id='detailTabTable']/tr[3]/td[2]");
    string location = GetLocation(node);
    Console.WriteLine("Location: " + location);
}
Let me know if I misunderstood something.
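The tbody in the original expression most likely came from a browser: browsers add an implicit tbody to the DOM, so an XPath copied from developer tools contains it, while HtmlAgilityPack parses the raw markup, which has none here. The following is only a sketch of one way to make the lookup independent of that difference. It uses a descendant step (`//`) and matches the row by its "Locality" label instead of by position. The class and method names (`TbodyTolerantXPath`, `FindLocality`) and the shortened sample markup are made up for illustration.

```csharp
using System;
using HtmlAgilityPack;

public static class TbodyTolerantXPath
{
    // Hypothetical helper: returns the trimmed text of the cell that follows
    // the "Locality" label cell, or null when no such cell is found.
    public static string FindLocality(string html)
    {
        var document = new HtmlDocument();
        document.LoadHtml(html);

        // The "//" before td matches rows that sit directly under <table>
        // as well as rows wrapped in a <tbody>.
        var node = document.DocumentNode.SelectSingleNode(
            "//*[@id='detailTabTable']//td[normalize-space()='Locality']/following-sibling::td[1]");

        return node?.InnerText.Trim();
    }

    public static void Main()
    {
        // Shortened sample rows (an assumption; the real table has more attributes).
        const string rows =
            "<tr><td>Plot size (Sq. Ft.):</td><td>N/A</td></tr>" +
            "<tr><td>Locality</td><td>Dubai, Dubai</td></tr>";

        string bare = "<table id=\"detailTabTable\">" + rows + "</table>";
        string wrapped = "<table id=\"detailTabTable\"><tbody>" + rows + "</tbody></table>";

        Console.WriteLine(FindLocality(bare));
        Console.WriteLine(FindLocality(wrapped));
    }
}
```

Matching on the label text also keeps working if rows are added or reordered, which a positional `tr[3]` would not. The result can still be passed through GetLocation to cut it at the first comma.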
Answer 1: (score: 0)
You can also use Fizzler and select the element the CSS way :) http://blog.simontimms.com/2014/02/24/parsing-html-in-c-using-css-selectors/
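A rough sketch of what that could look like, assuming the Fizzler.Systems.HtmlAgilityPack package is installed (it adds `QuerySelector` and `QuerySelectorAll` extension methods to `HtmlNode`). The selector, the class name `FizzlerSketch`, the method name `FindThirdRowValue`, and the shortened sample markup are illustrative assumptions, not taken from the linked post.

```csharp
using System;
using HtmlAgilityPack;
using Fizzler.Systems.HtmlAgilityPack; // QuerySelector / QuerySelectorAll extensions

public static class FizzlerSketch
{
    // Hypothetical helper: returns the trimmed text of the value cell in the
    // third row of the table, or null when nothing matches.
    public static string FindThirdRowValue(string html)
    {
        var document = new HtmlDocument();
        document.LoadHtml(html);

        // The descendant combinator (a space) does not care whether the rows
        // are wrapped in a <tbody>.
        var cell = document.DocumentNode.QuerySelector(
            "#detailTabTable tr:nth-child(3) td.detailTabContentRt");

        return cell?.InnerText.Trim();
    }

    public static void Main()
    {
        // Shortened version of the table from the question.
        const string html =
            "<table id=\"detailTabTable\">" +
            "<tr><td class=\"detailTabContentLt\">Current List Price:</td>" +
            "<td class=\"detailTabContentPriceRt\">AED 6,600,000</td></tr>" +
            "<tr><td class=\"detailTabContentLt\">Plot size (Sq. Ft.):</td>" +
            "<td class=\"detailTabContentRt\">N/A</td></tr>" +
            "<tr><td class=\"detailTabContentLt\">Locality</td>" +
            "<td class=\"detailTabContentRt\">Dubai, Dubai</td></tr>" +
            "</table>";

        Console.WriteLine(FindThirdRowValue(html));
    }
}
```

Whether this is preferable to XPath is mostly a matter of taste; the CSS form tends to be shorter when elements can be picked by id and class.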