Question

My code is as follows:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using HtmlAgilityPack;

namespace ConsoleApplication2
{
    class Program
    {
        static void Main(string[] args)
        {
            HtmlWeb webClient = new HtmlWeb();

            HtmlAgilityPack.HtmlDocument doc = webClient.Load("https://uk.finance.yahoo.com/q/hp?s=0001.HK");

            string date = doc.DocumentNode.SelectSingleNode(@"/html/body/div/div/table/tbody/tr[2]/td[1]/table/tbody/tr/td/table/tbody/tr[2]/td[1]").InnerText;

            Console.Write(date);
            Console.ReadKey();
        }
    }
}

But the XPath returns null. I checked it with XPath Helper and it is correct (see the attached image).

XPATH

Can anyone give me some ideas? Thanks.

Answer 1

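You didn't say which node you actually want to extract. From your XPath I understand you want the first column of the table, is that right? There are many reasons an XPath can return null; the main one here is that you should use a more general XPath. The plugin you are using is good, but it produces a very specific XPath, so if the smallest thing on the page changes, even something you cannot see, the XPath is no longer valid. Another thing to keep in mind is that your browser changes the HTML (for example, I have run into many differences caused by Chrome, especially tables being turned into divs), so an XPath that works in the browser may not match the HTML your program downloads. One more important point: when using id or class names in an XPath, I prefer to use 'contains' rather than the exact name because, as mentioned above, those can change too.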
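To illustrate the point about browser-modified HTML, here is a minimal sketch. It assumes one common case that the answer does not name explicitly: browsers insert `<tbody>` elements into tables whose source omits them, while HtmlAgilityPack parses the markup as written. The sample markup, the date value, and the `TbodyDemo` class are hypothetical; only the class name `yfnc_tabledata1` comes from the thread.

```csharp
using System;
using HtmlAgilityPack;

static class TbodyDemo
{
    // Hypothetical page source: the table is written without <tbody>,
    // which is how raw HTML often looks before a browser normalizes it.
    const string SampleHtml =
        "<html><body><table>" +
        "<tr><td class='yfnc_tabledata1'>10 Apr 2015</td></tr>" +
        "</table></body></html>";

    // True when the XPath selects at least one node in the sample markup.
    public static bool Matches(string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(SampleHtml);
        return doc.DocumentNode.SelectSingleNode(xpath) != null;
    }

    static void Main()
    {
        // Browser-style absolute path that includes tbody: no such element in the source.
        Console.WriteLine(Matches("/html/body/table/tbody/tr/td"));

        // General path using contains(): does not depend on the exact nesting.
        Console.WriteLine(Matches("//td[contains(@class, 'tabledata')]"));
    }
}
```

Under these assumptions the first check should report no match and the second should find the cell, which is the same pattern as the null result in the question.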

TL;DR:

// html is assumed to hold the page source as a string
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(html);
HtmlNode node = doc.DocumentNode.SelectSingleNode("//table[contains(@class, 'yfnc')]//table//td[contains(@class, 'yfnc_tabledata1')][1]");
if (node != null)
{
    // Extract its InnerText
}

One last thing: when you extract a node, you should check that it is not null before trying to read its InnerText, or wrap the call in a try-catch clause that catches NullReferenceException.
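The null check can be folded into a small helper, sketched below. `SafeExtract` and `FirstText` are hypothetical names, the sample markup and date are made up for illustration, and the null-conditional operator `?.` assumes C# 6 or later.

```csharp
using System;
using HtmlAgilityPack;

static class SafeExtract
{
    // Returns the trimmed InnerText of the first node matching xpath,
    // or null when nothing matches, instead of throwing NullReferenceException.
    public static string FirstText(string html, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        HtmlNode node = doc.DocumentNode.SelectSingleNode(xpath);
        return node?.InnerText.Trim();
    }

    static void Main()
    {
        // Hypothetical markup standing in for the downloaded page.
        string html = "<table><tr><td class='yfnc_tabledata1'> 10 Apr 2015 </td></tr></table>";

        string date = FirstText(html, "//td[contains(@class, 'yfnc_tabledata1')]");
        Console.WriteLine(date ?? "(no match)");

        // This path expects a tbody that the markup does not contain.
        string missing = FirstText(html, "//table/tbody/tr/td");
        Console.WriteLine(missing ?? "(no match)");
    }
}
```

Returning null and letting the caller decide is usually preferable to catching NullReferenceException, since the exception would also hide unrelated bugs.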

C# HtmlAgilityPack XPath returns System.NullReferenceException
