C# HtmlAgilityPack: working with an HtmlNodeCollection

Time: 2017-09-05 19:24:47

Tags: c# html-agility-pack

Here is the HTML:

<div id="frmPnlProductGallery">
        <ul class="clearfix">
                <li>
                    <a data-index="0" class="productItem" href="javascript:void(0)" title="https://file.digi-kala.com/digikala/Image/Webstore/ProductPhoto/P_118274/Original/234942.jpg" rel="dk-gallery-item" data-imgurl="https://file.digi-kala.com/digikala/Image/Webstore/ProductPhoto/P_118274/Original/234942.jpg">
                    </a>
                </li>
                <li>
                    <a data-index="1" class="productItem" href="javascript:void(0)" title="https://file.digi-kala.com/digikala/Image/Webstore/ProductPhoto/P_118274/Original/c9ebc3.jpg" rel="dk-gallery-item" data-imgurl="https://file.digi-kala.com/digikala/Image/Webstore/ProductPhoto/P_118274/Original/c9ebc3.jpg">
                    </a>
                </li>
                <li>
                    <a data-index="2" class="productItem" href="javascript:void(0)" title="https://file.digi-kala.com/digikala/Image/Webstore/ProductPhoto/P_118274/Original/12199f.jpg" rel="dk-gallery-item" data-imgurl="https://file.digi-kala.com/digikala/Image/Webstore/ProductPhoto/P_118274/Original/12199f.jpg">
                    </a>
                </li>
        </ul>
    </div>

Now I want to collect the three title attribute values into a List<string>. Here is my code so far:

var lis = htmlDoc.DocumentNode.SelectNodes("//div[@id='frmPnlProductGallery']//ul//li");
List<string> ls_images = new List<string>();

How can I get these three titles?

2 Answers:

Answer 0: (score: 1)

You can use LINQ here. For example:

document.DocumentNode.Descendants("a").Where(_ => _.HasClass("productItem")).Select(_ => _.GetAttributeValue("title", ""));

Here is the HasClass extension method:

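// Must be declared inside a static class; requires "using System.Linq;".
public static bool HasClass(this HtmlNode node, params string[] classValueArray)
{
     // Split the class attribute into its individual class names.
     var classValue = node.GetAttributeValue("class", "");
     var classValues = classValue.Split(' ');
     // True only if every requested class is present on the node.
     return classValueArray.All(c => classValues.Contains(c));
}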
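
To show how the query and the extension method fit together, here is a minimal self-contained sketch. It assumes the HtmlAgilityPack NuGet package is referenced. The names `HtmlNodeExtensions`, `HasAllClasses`, `GalleryDemo` and `GetTitles` are illustrative and not from the original answer, and the `html` string is a shortened stand-in for the question's markup with placeholder URLs. The extension is renamed here only so that it cannot clash with any `HasClass` member that the installed HtmlAgilityPack version may already provide on `HtmlNode`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack; // assumption: HtmlAgilityPack NuGet package is installed

public static class HtmlNodeExtensions
{
    // Hypothetical helper: true when the node carries every class in `required`.
    public static bool HasAllClasses(this HtmlNode node, params string[] required)
    {
        var present = node.GetAttributeValue("class", "")
                          .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        return required.All(c => present.Contains(c));
    }
}

public static class GalleryDemo
{
    // Collects the title attribute of every <a class="productItem"> in the HTML.
    public static List<string> GetTitles(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        return doc.DocumentNode
                  .Descendants("a")
                  .Where(a => a.HasAllClasses("productItem"))
                  .Select(a => a.GetAttributeValue("title", ""))
                  .ToList();
    }

    public static void Main()
    {
        // Shortened stand-in for the question's markup (placeholder URLs).
        const string html = @"<div id='frmPnlProductGallery'><ul>
            <li><a class='productItem' title='https://example.com/a.jpg'></a></li>
            <li><a class='productItem' title='https://example.com/b.jpg'></a></li>
        </ul></div>";

        foreach (var title in GetTitles(html))
            Console.WriteLine(title);
    }
}
```

Calling `.ToList()` at the end materializes the lazy LINQ query into the `List<string>` the question asks for.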

Answer 1: (score: 1)

Append /a to your XPath and select the title attribute:

List<string> ls_images = htmlDoc.DocumentNode
      .SelectNodes("//div[@id='frmPnlProductGallery']/ul/li/a")
      .Select(x => x.GetAttributeValue("title", string.Empty))
      .ToList();
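
One caveat with the XPath approach: `SelectNodes` returns `null` rather than an empty collection when nothing matches, so chaining `.Select(...)` directly throws if the gallery is missing from the page. A guarded variant might look like the sketch below. The class and method names (`GalleryTitles`, `Extract`) are hypothetical, the HtmlAgilityPack package is assumed to be referenced, and the XPath starts with `//` so that the div is found at any depth in the document.

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack; // assumption: HtmlAgilityPack NuGet package is installed

public static class GalleryTitles
{
    // Returns the title attributes of the gallery links, or an empty list
    // when the gallery markup is not present.
    public static List<string> Extract(HtmlDocument htmlDoc)
    {
        // SelectNodes yields null (not an empty collection) when the XPath matches nothing.
        HtmlNodeCollection anchors = htmlDoc.DocumentNode
            .SelectNodes("//div[@id='frmPnlProductGallery']/ul/li/a");

        if (anchors == null)
            return new List<string>();

        return anchors
            .Select(a => a.GetAttributeValue("title", string.Empty))
            .Where(t => t.Length > 0) // skip links that have no title attribute
            .ToList();
    }
}
```

With the question's variables this would be used as `List<string> ls_images = GalleryTitles.Extract(htmlDoc);`.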