HtmlAgilityPack SelectSingleNode returns an HtmlNode with no InnerHtml

Asked: 2016-09-09 12:17:48

Tags: c# html dom xpath html-agility-pack

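I am a bit confused by the SelectSingleNode method. I pass it a simple XPath expression and expect to get the node with its full content, including all nested nodes. Instead, I get only the HTML tag I was looking for, without any inner or outer text, and the node has no children at all.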

The XPath:

//form

Here is the HTML:

<HTML>
<BODY>
<FORM METHOD="POST" ACTION="https://test.com/action">
<INPUT TYPE="hidden" NAME="attribute1" VALUE="some value"/>
<INPUT TYPE="hidden" NAME="attribute2" VALUE="another value"/>
</FORM>
</BODY>
</HTML>

And the method:

    public List<Parameter> CollectFilledInputsFromResponseForm(IRestResponse response, string formXpath)
    {
        var responseAsHtml = new HtmlDocument();
        responseAsHtml.LoadHtml(response.Content);
        var formDoc = responseAsHtml.DocumentNode.SelectSingleNode(formXpath);

        if (formDoc == null)
            throw new Exception(string.Format("No form found for '.{0}' xPath", formXpath));

        var formHtml = new HtmlDocument();
        formHtml.LoadHtml(formDoc.OuterHtml);
        var inputs = formHtml.DocumentNode.SelectNodes("//input");

        var parameters = new List<Parameter>();
        foreach (var input in inputs)
        {
            var name = input.GetAttributeValue("name", "Name not found");
            var value = input.GetAttributeValue("value", "Value not found");

            if (name.Equals("Name not found") || value.Equals("Value not found"))
                continue;

            parameters.Add(new Parameter { Name = name, Value = value, Type = ParameterType.GetOrPost });
        }

        return parameters;
    }


Please advise.

1 answer:

Answer 0 (score: 2)

  

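Call HtmlNode.ElementsFlags.Remove("form"); before loading the document. HtmlAgilityPack's default element flags treat form specially (it is allowed to overlap other elements), so the parser does not nest the following tags inside it and the inputs end up outside the form node. Removing the flag makes form parse like an ordinary container element.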

See https://stackoverflow.com/a/4219060/4033466
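
To make the suggestion concrete, here is a minimal sketch of how the fix might be applied to the HTML from the question. The class FormInputDemo and the method CollectInputs are hypothetical names introduced only for this illustration, and the sketch returns a plain dictionary instead of the RestSharp Parameter list so that it depends only on HtmlAgilityPack. It assumes a HtmlAgilityPack version whose default ElementsFlags still contain an entry for form; if the entry is absent, the Remove call simply does nothing.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class FormInputDemo
{
    // Hypothetical helper: returns name/value pairs of the inputs inside the matched form.
    public static Dictionary<string, string> CollectInputs(string html, string formXpath)
    {
        // ElementsFlags is static, so this must run before LoadHtml
        // and it affects every document parsed afterwards in this process.
        HtmlNode.ElementsFlags.Remove("form");

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var form = doc.DocumentNode.SelectSingleNode(formXpath);
        if (form == null)
            throw new InvalidOperationException(
                string.Format("No form found for XPath '{0}'", formXpath));

        var result = new Dictionary<string, string>();

        // Query relative to the form node; SelectNodes returns null when nothing matches.
        var inputs = form.SelectNodes(".//input");
        if (inputs == null)
            return result;

        foreach (var input in inputs)
        {
            var nameAttr = input.Attributes["name"];
            var valueAttr = input.Attributes["value"];

            // Skip inputs that lack either attribute, as the original method does.
            if (nameAttr == null || valueAttr == null)
                continue;

            result[nameAttr.Value] = valueAttr.Value;
        }

        return result;
    }

    public static void Main()
    {
        const string html =
            "<HTML><BODY>" +
            "<FORM METHOD=\"POST\" ACTION=\"https://test.com/action\">" +
            "<INPUT TYPE=\"hidden\" NAME=\"attribute1\" VALUE=\"some value\"/>" +
            "<INPUT TYPE=\"hidden\" NAME=\"attribute2\" VALUE=\"another value\"/>" +
            "</FORM></BODY></HTML>";

        foreach (var pair in CollectInputs(html, "//form"))
            Console.WriteLine(pair.Key + " = " + pair.Value);
    }
}
```

Once the form node carries its children, the second HtmlDocument built from formDoc.OuterHtml in the question is probably unnecessary: a relative query such as .//input on the form node itself should be enough. Because the flag change is global, it may be worth doing once at application startup rather than inside the method.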