C# and Html Agility Pack: null exception in nested foreach loops

Time: 2017-02-10 15:28:12

Tags: c# html-agility-pack

I want to use Html Agility Pack in C# to get the text and link value of each <li>, and I also want to get the <div><h1> value from the page behind each <li> link: abc.com/one.html, abc.com/two.html and abc.com/three.html.

I get this error at runtime:

System.ArgumentNullException: Value cannot be null.

Stack trace:

[ArgumentNullException: Value cannot be null. Parameter name: second]
System.Linq.Enumerable.Zip(IEnumerable`1 first, IEnumerable`1 second, Func`3 resultSelector) +2619657
System.Web.Util.CalliEventHandlerDelegateProxy.Callback(Object sender, EventArgs e) +51
System.Web.UI.Control.OnLoad(EventArgs e) +95
System.Web.UI.Control.LoadRecursive() +59
System.Web.UI.Page.ProcessRequestMain(Boolean includeStagesBeforeAsyncPoint, Boolean includeStagesAfterAsyncPoint) +678 
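
The top frame of the trace is Enumerable.Zip and the parameter name is `second`, which suggests that the second sequence handed to Zip was null. A minimal sketch that reproduces the same exception without Html Agility Pack, assuming the null came from a node query that matched nothing (the class, method, and variable names here are hypothetical):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ZipNullDemo
{
    // Calls Zip with a null second sequence and returns the parameter name
    // reported by the resulting ArgumentNullException.
    public static string NullZipParamName()
    {
        IEnumerable<string> texts = new[] { "ListOne", "ListTwo" };
        // Stands in for a query result that came back as null.
        IEnumerable<string> links = null;

        try
        {
            // Zip validates its arguments immediately, before any enumeration.
            texts.Zip(links, (t, l) => t + " -> " + l);
            return null;
        }
        catch (ArgumentNullException ex)
        {
            return ex.ParamName;
        }
    }

    static void Main()
    {
        Console.WriteLine(NullZipParamName()); // prints "second"
    }
}
```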

HTML

<ul>
    <li>ListOne<a href="abc.com/one.html"></a></li>
    <li>ListTwo<a href="abc.com/two.html"></a></li>
    <li>ListThree<a href="abc.com/three.html"></a></li>  
</ul>

*Note: this code works for the XPath of some elements on the pages abc.com/one.html, abc.com/two.html and abc.com/three.html.

1 Answer:

Answer 0 (score: 0)

Found the solution. Some of the XPath queries return null, so I only added one line of code: a null check before the loop.

string Url = "WebAddress1";
HtmlWeb web = new HtmlWeb();
HtmlDocument doc = web.Load(Url);
// SelectNodes returns null (not an empty collection) when the XPath matches nothing.
var items = doc.DocumentNode.SelectNodes("//*[@id=\"pageContent\"]/ul[1]/li");
var links = doc.DocumentNode.SelectNodes("//*[@id=\"pageContent\"]/ul[1]/li/a");
if (items != null && links != null) // Added this check
{
    foreach (...) { /* ... */ }
}
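
An alternative to guarding every call site is to turn a null result into an empty sequence once, so the loop simply runs zero times. The following is only a sketch under assumptions: `SelectNodesOrEmpty` is a hypothetical helper, not part of Html Agility Pack, and the inline HTML wraps the question's list in a `pageContent` div so that the XPath from the answer matches.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

static class SafeSelect
{
    // Hypothetical helper: returns an empty sequence instead of null
    // when the XPath matches nothing.
    public static IEnumerable<HtmlNode> SelectNodesOrEmpty(HtmlNode root, string xpath)
    {
        return (IEnumerable<HtmlNode>)root.SelectNodes(xpath) ?? Enumerable.Empty<HtmlNode>();
    }

    static void Main()
    {
        var doc = new HtmlDocument();
        // Assumed markup: the question's list inside an element with id "pageContent".
        doc.LoadHtml(@"<div id=""pageContent""><ul>
            <li>ListOne<a href=""abc.com/one.html""></a></li>
            <li>ListTwo<a href=""abc.com/two.html""></a></li>
            <li>ListThree<a href=""abc.com/three.html""></a></li>
        </ul></div>");

        foreach (var li in SelectNodesOrEmpty(doc.DocumentNode, "//*[@id=\"pageContent\"]/ul[1]/li"))
        {
            // Read the anchor from inside this <li>, so text and link stay paired
            // even if some item has no link.
            var a = li.SelectSingleNode("./a");
            var href = a != null ? a.GetAttributeValue("href", "") : "";
            Console.WriteLine(li.InnerText.Trim() + " -> " + href);
        }

        // An XPath with no match now produces an empty loop instead of an exception.
        foreach (var row in SelectNodesOrEmpty(doc.DocumentNode, "//table/tr"))
        {
            Console.WriteLine(row.InnerText);
        }
    }
}
```

Reading the link from within each `<li>` also avoids zipping two separately selected collections, which is where the null argument in the trace above originated.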