How do I loop through every node of an HTML document with HtmlAgilityPack and remove certain nodes?

Asked: 2013-08-27 11:05:11

Tags: c# dom html-agility-pack

I need to identify each node and remove certain ones, for example P, legend, and so on. I need to loop through the following HTML using HtmlAgilityPack:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD><META content="IE=5.0000" http-equiv="X-UA-Compatible">

<META http-equiv="Content-Type" content="text/html; charset=windows-1252">
</HEAD>
<BODY bgcolor="white"><text><TITLE>ABCD</TITLE> 

<P style="page-break-before: always;">
<HR width="100%" size="3" align="CENTER" style="color: rgb(153, 153, 153);">

<fieldset>
    <legend>Personalia:</legend>
    Name: <input type="text"><br>
    Email: <input type="text"><br>
    Date of birth: <input type="text">
  </fieldset>

<P style="margin-top: 0px; margin-bottom: 0px;"><FONT size="1">&nbsp;</FONT></P>
<P align="center" style="margin-top: 0px; margin-bottom: 0px;"><FONT size="2" 
style="font-family: Times New Roman;">B-17 </FONT></P></text>

</BODY></HTML>

1 Answer:

Answer 0 (score: 0)

Here is just an example, give it a try:

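    using HtmlAgilityPack;

    string content = "Your HTML page source as a string";
    // Drop the special parsing flag for <form> so that it can contain child nodes.
    HtmlNode.ElementsFlags.Remove("form");
    HtmlDocument doc = new HtmlDocument();
    doc.LoadHtml(content);

    // Pass the lowercase name of the tag you want to remove, e.g. "legend" or "p".
    DeleteTagByName("legend", doc);

    // Removes every element with the given tag name, together with its children.
    static void DeleteTagByName(string name, HtmlDocument htmlDocument)
    {
        // SelectNodes returns null when nothing matches, so guard before iterating.
        HtmlNodeCollection nodes = htmlDocument.DocumentNode.SelectNodes("//" + name);
        if (nodes == null) return;
        foreach (HtmlNode node in nodes)
        {
            node.Remove();
        }
    }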
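
Since the question asks about visiting every node, here is a minimal sketch of one way to do that, assuming the tags to drop are `p` and `legend`. The shortened input string and the `tagsToRemove` set are made up for illustration and are not part of the original answer. It walks the tree with `Descendants()`, collects the matching elements first, and only then removes them, so the tree is not modified while it is being enumerated.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // Hypothetical, shortened input; replace with your own page source.
        string html = "<html><body><fieldset><legend>Personalia:</legend>"
                    + "Name: <input type=\"text\"></fieldset><p>B-17</p></body></html>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Assumed list of tag names to remove (compared case-insensitively).
        var tagsToRemove = new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "p", "legend" };

        // Descendants() is evaluated lazily, so take a snapshot with ToList()
        // before changing the tree.
        List<HtmlNode> matches = doc.DocumentNode.Descendants()
            .Where(n => n.NodeType == HtmlNodeType.Element && tagsToRemove.Contains(n.Name))
            .ToList();

        foreach (HtmlNode node in matches)
        {
            Console.WriteLine("Removing <" + node.Name + ">");
            node.Remove(); // removes the element together with its children
        }

        // Print the remaining markup.
        Console.WriteLine(doc.DocumentNode.OuterHtml);
    }
}
```

If you want to drop only the tag itself and keep its content, `node.ParentNode.RemoveChild(node, true)` may be a better fit than `node.Remove()`, since the second argument asks the library to keep the grandchildren.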