HtmlAgilityPack - ensure CanOverlap and Closed at the same time

Time: 2017-01-13 10:10:26

Tags: c# html html-agility-pack

How can I define HtmlNode.ElementsFlags["div"] so that it is CanOverlap and must also be Closed?

I have this HTML (correct structure):

<p>
    <div>
        <b>text:</b> 
        <img alt="" src="#" style="BORDER: 0px solid; ">
    </div>
    <div>
        <b>text:</b> 
        <div></div>
        <div></div>
        <p>text</p>
    </div>
</p>

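I need to make sure that every tag is opened and closed correctly, and I am using HtmlAgilityPack to do this. However, HtmlAgilityPack is changing my HTML, because it does not treat the tags as CanOverlap.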

HTML returned by HtmlAgilityPack (wrong structure):

<p>
   <div>
      <b>text:</b>
      <img alt="" src="#" style="BORDER: 0px solid; " />
   </div>
   <div />
   <b>text:</b>
   <div />
   <div>
        <p>
            text
        </p>
   </div>
</p>

How can I solve this? How do I tell HtmlAgilityPack that a tag is CanOverlap and also make sure it is Closed?

C# code

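// Register p, span and div as Closed: add the entry if it is missing,
// otherwise overwrite the existing flags.
if (!HtmlNode.ElementsFlags.ContainsKey("p"))
    HtmlNode.ElementsFlags.Add("p", HtmlElementFlag.Closed);
else
    HtmlNode.ElementsFlags["p"] = HtmlElementFlag.Closed;

if (!HtmlNode.ElementsFlags.ContainsKey("span"))
    HtmlNode.ElementsFlags.Add("span", HtmlElementFlag.Closed);
else
    HtmlNode.ElementsFlags["span"] = HtmlElementFlag.Closed;

if (!HtmlNode.ElementsFlags.ContainsKey("div"))
    HtmlNode.ElementsFlags.Add("div", HtmlElementFlag.Closed);
else
    HtmlNode.ElementsFlags["div"] = HtmlElementFlag.Closed;

// Parse the input (myHtml holds the HTML string shown above).
var htmlDoc = new HtmlDocument();
htmlDoc.OptionFixNestedTags = true;
htmlDoc.OptionWriteEmptyNodes = true;
htmlDoc.LoadHtml(myHtml);

// SafeAny() is not part of HtmlAgilityPack or LINQ; it is presumably
// the asker's own null-safe Any() extension method.
var htmlError = htmlDoc.ParseErrors.SafeAny();

// Only take the re-serialized HTML when no parse errors were reported.
if (!htmlError)
    myHtml = htmlDoc.DocumentNode.InnerHtml;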

1 answer:

Answer 0: (score: 0)

Solved! We can set the HtmlNode.ElementsFlags entry to Closed and CanOverlap, like this:

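// Note: & is a bitwise AND. CanOverlap and Closed are distinct bits of a [Flags] enum,
// so this expression evaluates to 0 (no flags set); | is the operator that sets both.
if (!HtmlNode.ElementsFlags.ContainsKey("div"))
    HtmlNode.ElementsFlags.Add("div", HtmlElementFlag.CanOverlap & HtmlElementFlag.Closed);
else
    HtmlNode.ElementsFlags["div"] = HtmlElementFlag.CanOverlap & HtmlElementFlag.Closed;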
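
Since HtmlElementFlag is a [Flags] enum, it may help to see what the two bitwise operators produce when applied to its members. The following is a minimal sketch that does not reference HtmlAgilityPack at all: ElementFlag and FlagDemo are hypothetical stand-ins, and the member values are an assumption meant to mirror the library's enum (CData = 1, Empty = 2, Closed = 4, CanOverlap = 8).

```csharp
using System;

// Hypothetical stand-in for HtmlAgilityPack's HtmlElementFlag.
// The numeric values are assumed to match the library's enum.
[Flags]
public enum ElementFlag
{
    CData = 1,
    Empty = 2,
    Closed = 4,
    CanOverlap = 8
}

public static class FlagDemo
{
    // Bitwise AND keeps only the bits set in both operands; these two share none.
    public static readonly ElementFlag CombinedWithAnd =
        ElementFlag.CanOverlap & ElementFlag.Closed;

    // Bitwise OR keeps every bit set in either operand.
    public static readonly ElementFlag CombinedWithOr =
        ElementFlag.CanOverlap | ElementFlag.Closed;

    public static void Main()
    {
        Console.WriteLine((int)CombinedWithAnd);                         // 0
        Console.WriteLine((int)CombinedWithOr);                          // 12
        Console.WriteLine(CombinedWithAnd.HasFlag(ElementFlag.Closed));  // False
        Console.WriteLine(CombinedWithOr.HasFlag(ElementFlag.Closed));   // True
    }
}
```

Under these assumed values, the `&` form stores an entry with no flags set for "div", while the `|` form stores both Closed and CanOverlap. Which of the two gives the desired output for a given document is best checked against the actual HTML.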