Extract all forms and all of their inputs from an HTML page using HtmlAgilityPack and VB.NET

Asked: 2015-02-23 19:57:00

Tags: c# html vb.net xpath html-agility-pack

How can I use HtmlAgilityPack to get all the forms on a page, together with all of their input fields? I have the following code:

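Try
    Dim Web As New System.Net.WebClient
    Dim htdoc As New HtmlAgilityPack.HtmlDocument
    ' Download the page source and parse it
    Dim page As String = Web.DownloadString("http://test.com/student/register.php")
    htdoc.LoadHtml(page)
    ' Show the markup of every <form> element found in the document
    For Each item As HtmlAgilityPack.HtmlNode In htdoc.DocumentNode.Descendants("form")
        MsgBox(item.OuterHtml)
    Next
Catch ex As Exception
    ' Report the failure instead of silently swallowing it
    MsgBox(ex.Message)
End Try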

A sample of the HTML page:

<HTML>
<BODY>
<STRONG>TEST</STRONG>
<STRONG>TEST</STRONG>
<STRONG>TEST</STRONG>
<form action="register.php?do=checkdate" method="post" onsubmit="return checkform(this)">
<input type="hidden" name="do" value="checkdate">
<input type="hidden" name="s" value="X">
<input type="hidden" name="securitytoken" value="guest">
<input type="hidden" name="url" value="https://www.google.com/">
</form>
<STRONG>TEST</STRONG>
<STRONG>TEST</STRONG>
<STRONG>TEST</STRONG>
<form action="index.php" method="get" style="clear:left">
<td class="tfoot" align="right" width="100%">
        <div class="smallfont">
            <strong>
                <a href="https://support.test.com/?s=5413e493b8e32114ba2ff4d2e038ace3" rel="nofollow" accesskey="9">Contact Us</a> -
                <a href="http://store.test.com/">Steam Store</a> -

            <a href="archive/index.php">Archive</a> -
                <a href="http://store.test.com/privacy_agreement/">Privacy Statement</a> -
                <a href="http://store.test.com/subscriber_agreement/">Terms of Service</a> -
                <a href="#top" onclick="self.scrollTo(0, 0); return false;">Top</a>
            </strong>
        </div>
</td>
</form>
<STRONG>TEST</STRONG>
<STRONG>TEST</STRONG>
<STRONG>TEST</STRONG>
<form id="searchform" name="searchform" method="get" action="http://store.test.com/search/" onsubmit="return SearchSuggestCheckTerm(this);">
</form>
</BODY>
</HTML>

After running this code, all I get is the form tags themselves, without any of their descendants.
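
A possible explanation, offered as a sketch rather than a confirmed answer: HtmlAgilityPack keeps a static table, `HtmlNode.ElementsFlags`, and by default its entry for `form` makes the parser treat `<form>` as an element that does not nest its content, so the inputs end up as siblings of the form rather than its children. Assuming that is what happens with the installed version, removing the entry before loading the document lets each form keep its descendants. The module name `FormExtractor` and the inline HTML string are illustrative and not part of the original code.

```vbnet
Imports System
Imports HtmlAgilityPack

Module FormExtractor
    Sub Main()
        ' Assumption: the default "form" entry in ElementsFlags is what strips
        ' the children. Remove it BEFORE parsing any document.
        HtmlNode.ElementsFlags.Remove("form")

        ' Illustrative markup, shortened from the sample page above.
        Dim html As String =
            "<html><body>" &
            "<form action=""register.php?do=checkdate"" method=""post"">" &
            "<input type=""hidden"" name=""do"" value=""checkdate"">" &
            "<input type=""hidden"" name=""s"" value=""X"">" &
            "</form>" &
            "</body></html>"

        Dim doc As New HtmlDocument()
        doc.LoadHtml(html)

        ' Walk every form, then every input nested inside it.
        For Each formNode As HtmlNode In doc.DocumentNode.Descendants("form")
            Console.WriteLine("Form action: " & formNode.GetAttributeValue("action", ""))
            For Each field As HtmlNode In formNode.Descendants("input")
                Console.WriteLine("  " & field.GetAttributeValue("name", "") &
                                  " = " & field.GetAttributeValue("value", ""))
            Next
        Next
    End Sub
End Module
```

Because `ElementsFlags` is shared process-wide, the change affects every document parsed afterwards, so it is best made once at startup. If the flag cannot be changed, querying the inputs directly (for example with `Descendants("input")` on the document node) would still find them, though without the link back to their owning form.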

0 answers