Question

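I want to use Html Agility Pack in C# to get the text and link value of each <li>, and I also want to get the <div> and <h1> values from the web page behind each <li> link: abc.com/one.html, abc.com/two.html, and abc.com/three.html.

I get this error at runtime:

System.ArgumentNullException: Value cannot be null.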


Stack trace

[ArgumentNullException: Value cannot be null. Parameter name: second]
System.Linq.Enumerable.Zip(IEnumerable`1 first, IEnumerable`1 second, Func`3 resultSelector) +2619657
System.Web.Util.CalliEventHandlerDelegateProxy.Callback(Object sender, EventArgs e) +51
System.Web.UI.Control.OnLoad(EventArgs e) +95
System.Web.UI.Control.LoadRecursive() +59
System.Web.UI.Page.ProcessRequestMain(Boolean includeStagesBeforeAsyncPoint, Boolean includeStagesAfterAsyncPoint) +678

HTML

<ul>
    <li>ListOne<a href="abc.com/one.html"></a></li>
    <li>ListTwo<a href="abc.com/two.html"></a></li>
    <li>ListThree<a href="abc.com/three.html"></a></li>  
</ul>

*Note: this code works for the XPath of some elements on the pages abc.com/one.html, abc.com/two.html, and abc.com/three.html.

Answer 1

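I found the solution. Some of the XPath queries returned null, so the collection passed to Zip was null. I only had to add one line of code.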

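string Url = "WebAddress1";
HtmlWeb web = new HtmlWeb();
HtmlDocument doc = web.Load(Url);
// SelectNodes returns null (not an empty collection) when nothing matches.
var items = doc.DocumentNode.SelectNodes("//*[@id=\"pageContent\"]/ul[1]/li");
var links = doc.DocumentNode.SelectNodes("//*[@id=\"pageContent\"]/ul[1]/li/a");
if (items != null && links != null) // Added this line
{
    foreach (...) { /* loop body elided in the original */ }
}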
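
As an alternative to zipping two separate node collections, here is a minimal sketch that walks each <li> and looks up its <a> relative to that node, so one missing link cannot shift the pairs or pass a null collection to Zip. The class name ListLinkExtractor, the method Extract, and the XPath argument are hypothetical names chosen for illustration; the sketch assumes the HtmlAgilityPack package is referenced and that the link is a direct child of the <li>, as in the HTML from the question.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class ListLinkExtractor
{
    // Hypothetical helper: returns (text, href) pairs for every node matched by liXPath.
    public static List<KeyValuePair<string, string>> Extract(string html, string liXPath)
    {
        var results = new List<KeyValuePair<string, string>>();
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null, not an empty collection, when nothing matches.
        HtmlNodeCollection items = doc.DocumentNode.SelectNodes(liXPath);
        if (items == null)
            return results;

        foreach (HtmlNode li in items)
        {
            // Query the link relative to this <li> so text and href stay paired.
            HtmlNode link = li.SelectSingleNode("./a");
            string href = link == null ? "" : link.GetAttributeValue("href", "");
            results.Add(new KeyValuePair<string, string>(li.InnerText.Trim(), href));
        }
        return results;
    }
}
```

Calling ListLinkExtractor.Extract(html, "//ul/li") on the <ul> from the question should give one pair per <li>. A list item without a link gets an empty href instead of throwing. Each href could then be passed to HtmlWeb.Load to read the <div> and <h1> on the linked page, with the same null check applied to those queries.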

C# and Html Agility Pack: Null exception in nested foreach loop
