Getting the value of a specific part of a div with HtmlAgilityPack

Date: 2016-01-20 13:31:43

Tags: c# html html-parsing html-agility-pack

I am trying to get the value of a div using HtmlAgilityPack. My HTML looks like this:

<div class="div_5">
    <p>First Paragraph</p>
    <p>Second Paragraph</p>
    <p>Third Paragraph</p>
    <p>Fourth Paragraph</p>

    <div class="div_6">
        <p>First Paragraph</p>
        <p>Second Paragraph</p>
        <p>Third Paragraph</p>
        <p>Fourth Paragraph</p>
    </div>
    <p>other Paragraph</p>
    <p>other Paragraph</p>
</div>

I need the content of div_5 without the content of div_6, so I use this code:

    newsContent.Content = resultat1.DocumentNode.SelectSingleNode("//div[@class='div_5']").InnerHtml;

But this code returns the content of div_5 including div_6. How can I remove div_6 from my value?

3 answers:

Answer 0: (score: 1)

Final code:

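HtmlNode doc = resultat1.DocumentNode.SelectSingleNode("//div[@class='div_5']");
// ".//" searches only inside div_5; a bare "//" would search the whole document.
HtmlNode node = doc.SelectSingleNode(".//div[@class='div_6']");
node.ParentNode.RemoveChild(node);
newsContent.Content = doc.InnerHtml;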
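
For anyone who wants to try this outside the original project, here is a minimal self-contained sketch of the same idea. It assumes the HtmlAgilityPack NuGet package and a compiler that supports top-level statements (C# 9 or later). The helper name `GetContentWithoutNestedDiv` and the shortened sample markup are made up for illustration and are not part of the question's code.

```csharp
// Requires the HtmlAgilityPack NuGet package; uses C# top-level statements.
using System;
using HtmlAgilityPack;

// Shortened sample markup modelled on the question (an assumption, not the real page).
const string html = @"<div class=""div_5"">
  <p>First Paragraph</p>
  <div class=""div_6"">
    <p>Nested Paragraph</p>
  </div>
  <p>other Paragraph</p>
</div>";

Console.WriteLine(GetContentWithoutNestedDiv(html));

// Hypothetical helper: returns div_5's inner HTML with the nested div_6 removed.
static string GetContentWithoutNestedDiv(string source)
{
    var document = new HtmlDocument();
    document.LoadHtml(source);

    HtmlNode outer = document.DocumentNode.SelectSingleNode("//div[@class='div_5']");
    if (outer == null)
    {
        return string.Empty; // div_5 was not found
    }

    // The leading "." makes the XPath relative to div_5 instead of the document root.
    HtmlNode inner = outer.SelectSingleNode(".//div[@class='div_6']");
    if (inner != null)
    {
        // Removing through the actual parent also works when div_6 is nested deeper.
        inner.ParentNode.RemoveChild(inner);
    }

    return outer.InnerHtml;
}
```

The null checks matter because `SelectSingleNode` returns null when nothing matches, which would otherwise surface as a `NullReferenceException` on pages that lack one of the two divs.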

Answer 1: (score: 0)

Remove the inner node first, then continue.

var yourNode = resultat1.DocumentNode.SelectSingleNode("//div[@class='div_5']");
var toBeRemoved = resultat1.DocumentNode.SelectSingleNode("//div[@class='div_6']");

// toBeRemoved must be a direct child of yourNode; false also drops its descendants.
yourNode.RemoveChild(toBeRemoved, false);
// proceed with your code
newsContent.Content = yourNode.InnerHtml;

Answer 2: (score: -1)

I have never used HtmlAgilityPack, but try something along these lines:

var div5 = resultat1.DocumentNode.SelectSingleNode("//div[@class='div_5']");

// HtmlNode has no DocumentNode property; query relative to div5 instead.
var div6 = div5.SelectSingleNode(".//div[@class='div_6']");

div6.Remove();

newsContent.Content = div5.InnerHtml;
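
All three answers modify the parsed document. If the document is still needed intact afterwards, one possible alternative is to build the string from div_5's direct children and skip the unwanted one. The sketch below is only an illustration under the same assumptions (HtmlAgilityPack NuGet package, C# top-level statements). `InnerHtmlExcluding` is a hypothetical helper, and it only filters direct children of the outer div, which is enough for the markup in the question.

```csharp
// Requires the HtmlAgilityPack NuGet package; uses C# top-level statements.
using System;
using System.Linq;
using HtmlAgilityPack;

// Shortened sample markup modelled on the question (an assumption, not the real page).
const string html = @"<div class=""div_5"">
  <p>First Paragraph</p>
  <div class=""div_6"">
    <p>Nested Paragraph</p>
  </div>
  <p>other Paragraph</p>
</div>";

var document = new HtmlDocument();
document.LoadHtml(html);

Console.WriteLine(InnerHtmlExcluding(document, "div_5", "div_6"));

// Hypothetical helper: concatenates the outer div's direct children,
// leaving out any child whose class attribute equals excludedClass.
static string InnerHtmlExcluding(HtmlDocument doc, string outerClass, string excludedClass)
{
    HtmlNode outer = doc.DocumentNode.SelectSingleNode($"//div[@class='{outerClass}']");
    if (outer == null)
    {
        return string.Empty; // outer div was not found
    }

    var kept = outer.ChildNodes
        // Text nodes have no attributes, so they fall back to "" and are kept.
        .Where(child => child.GetAttributeValue("class", "") != excludedClass)
        .Select(child => child.OuterHtml);

    return string.Concat(kept);
}
```

Because nothing is removed, later queries against `document` still see div_6.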