Getting `NullPointerException` when extracting the href from a DIV with HtmlAgilityPack

Asked: 2013-09-30 06:38:32

Tags: c# asp.net web-scraping html-agility-pack

Below is my HTML snippet. It contains N DIVs whose class name includes quality, and I want to extract the <a href> from each of those DIVs.

<div class="quality wrap">
<a href="/Hotel_Review-g297608-d4464287-Reviews-Eastin_Easy_Citizen_Ahmedabad-Ahmedabad_Gujarat.html" id="property_4464287"class="property_title" onclick=" ta.setEvtCookie('Reviews', 'HotelName', 297608, 0, this.href); ta.util.cookie.setPIDCookie(15176);">
Eastin Easy Citizen Ahmedabad</a> </div>

<div class="quality wrap">
<a href="/Hotel_Review-g297608-d4464287-Reviews-Eastin_Easy_Citizen_Ahmedabad-Ahmedabad_Gujarat.html" id="property_4464287"class="property_title" onclick=" ta.setEvtCookie('Reviews', 'HotelName', 297608, 0, this.href); ta.util.cookie.setPIDCookie(15176);">
Eastin Easy Citizen Ahmedabad</a> </div>

<div class="quality wrap">
<a href="/Hotel_Review-g297608-d4464287-Reviews-Eastin_Easy_Citizen_Ahmedabad-Ahmedabad_Gujarat.html" id="property_4464287"class="property_title" onclick=" ta.setEvtCookie('Reviews', 'HotelName', 297608, 0, this.href); ta.util.cookie.setPIDCookie(15176);">
Eastin Easy Citizen Ahmedabad</a> </div>

I tried the following:

var nS = page.DocumentNode.SelectNodes("//div[@class='quality']//a");
foreach (HtmlNode linkNode in nS)
{
    // do something
}

But I get a NullPointerException. Can anyone help me?

1 Answer:

Answer 0: (score: 1)

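The class should be `quality wrap`, not `quality`. The predicate `@class='quality'` compares the entire attribute value, so it matches none of the DIVs here; `SelectNodes` then returns null, and the `foreach` over null throws.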

"//div[@class='quality wrap']//a"

So it would be:

// Requires: using System.Linq;
var hrefList = page.DocumentNode
                   .SelectNodes("//div[@class='quality wrap']//a")
                   // keep only the links whose text is this hotel name
                   .Where(e => e.InnerText.Trim() == "Eastin Easy Citizen Ahmedabad")
                   .Select(x => x.Attributes["href"].Value)
                   .ToList();
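
If the class list on the page might change (extra classes, different order), a slightly more defensive variant may be useful. The following is only a sketch, assuming the HtmlAgilityPack NuGet package is referenced; `QualityLinks` and `ExtractHrefs` are hypothetical names made up for illustration, not part of the library. It matches `quality` as one whitespace-separated token of the class attribute instead of comparing the whole value, and it checks for the null that `SelectNodes` returns when nothing matches.

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack; // NuGet package "HtmlAgilityPack"

// Hypothetical helper for illustration; not part of HtmlAgilityPack.
public static class QualityLinks
{
    public static List<string> ExtractHrefs(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // XPath 1.0 idiom for "class attribute contains the token 'quality'":
        // pad the normalized class value with spaces and search for ' quality '.
        // This matches class="quality wrap" but not class="qualityX".
        const string xpath =
            "//div[contains(concat(' ', normalize-space(@class), ' '), ' quality ')]//a[@href]";

        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes(xpath);

        // SelectNodes returns null (not an empty collection) when nothing matches,
        // so guard before enumerating.
        if (nodes == null)
        {
            return new List<string>();
        }

        return nodes
            .Select(a => a.GetAttributeValue("href", string.Empty))
            .ToList();
    }
}
```

Calling `QualityLinks.ExtractHrefs(html)` on the snippet from the question should give one href per `quality wrap` DIV, and an empty list rather than an exception when no DIV matches. Unlike the query above, this sketch does not filter by link text.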