Get an HTMLDocument from HttpWebRequest without using HtmlAgilityPack

Time: 2015-06-09 17:53:15

Tags: html vb.net

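I'm trying to write a function that returns an "HtmlDocument" using "HttpWebRequest" instead of a browser, but I'm stuck on transferring the content through InnerHtml.

I don't understand how to set the value of "mWebPage", because VB doesn't accept "New" for HTMLDocument.

I know I could use "HtmlAgilityPack", but I'd like to test my current code by changing only the web request, not all of the parsing code. (For that I need an HtmlDocument.)

After this test, I'll try changing the parsing code as well.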

Function mWebRe(ByVal mUrl As String) As HTMLDocument
    Dim request As HttpWebRequest = CType(WebRequest.Create(mUrl), HttpWebRequest)

    ' Set some reasonable limits on resources used by this request
    request.MaximumAutomaticRedirections = 4
    request.MaximumResponseHeadersLength = 4

    ' Set credentials to use for this request.
    request.Credentials = CredentialCache.DefaultCredentials

    'Here I've tried many types
    Dim mWebPage As HTMLDocument
    Try
        Dim request2 As HttpWebRequest = WebRequest.Create(mUrl)
        Dim response2 As HttpWebResponse = request2.GetResponse()
        Dim reader2 As StreamReader = New StreamReader(response2.GetResponseStream())
        Dim WebContent As String = reader2.ReadToEnd()

        'This is my last attempt
        'This gives Null Reference Exception
        mWebPage.Body.InnerHtml = WebContent


    Catch ex As Exception
        MsgBox(ex.ToString) 
    End Try

    Return mWebPage
End Function

I've tried many approaches (including importing the HTML Object Library), but nothing worked :(

2 Answers:

Answer 0 (score: 0)

OK, this is getting a little hacky, but it should work.

First, you need to instantiate a WebBrowser control at class level:

Private m_objWebBrowser As New WebBrowser()

Next, add an event handler for the DocumentCompleted event that contains all of your HTML parsing. You get an HtmlDocument instance by calling the OpenNew method on the WebBrowser control's Document.

Private Sub HandleParsing(ByVal sender As Object, ByVal e As WebBrowserDocumentCompletedEventArgs)

    'Use your code for generating WebContent.
    Dim WebContent As String = "<html></html>"

    Dim mWebPage As HtmlDocument = DirectCast(sender, WebBrowser).Document.OpenNew(True)

    mWebPage.Write(WebContent)

End Sub

Finally, you kick all of this off by wiring up the event handler and navigating to some page or to an HTML file on disk (DocumentCompleted fires asynchronously):

    AddHandler m_objWebBrowser.DocumentCompleted, AddressOf HandleParsing

    m_objWebBrowser.Navigate("www.google.com")
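For reference, the three steps above can be combined into a single class. This is only a minimal sketch under some assumptions: the class name HtmlLoader, the DownloadHtml helper, and the onReady callback are hypothetical names introduced here, the code runs on an STA thread with a Windows Forms message loop (otherwise DocumentCompleted never fires), and "about:blank" is used as the placeholder page instead of a real site. The handler detaches itself before writing, as a precaution in case the write raises DocumentCompleted again.

```vbnet
Imports System
Imports System.IO
Imports System.Net
Imports System.Windows.Forms

' Hypothetical helper class; the name and its members are illustrative only.
Public Class HtmlLoader
    Private ReadOnly m_objWebBrowser As New WebBrowser()
    Private m_strUrl As String
    Private m_onReady As Action(Of HtmlDocument)

    ' Downloads mUrl with HttpWebRequest and passes the resulting
    ' HtmlDocument to onReady once the browser control is ready.
    Public Sub Load(ByVal mUrl As String, ByVal onReady As Action(Of HtmlDocument))
        m_strUrl = mUrl
        m_onReady = onReady
        AddHandler m_objWebBrowser.DocumentCompleted, AddressOf HandleParsing
        ' A blank page is enough to obtain a Document to call OpenNew on.
        m_objWebBrowser.Navigate("about:blank")
    End Sub

    Private Sub HandleParsing(ByVal sender As Object, ByVal e As WebBrowserDocumentCompletedEventArgs)
        Dim browser As WebBrowser = DirectCast(sender, WebBrowser)
        ' Detach first so writing the new document cannot re-enter this handler.
        RemoveHandler browser.DocumentCompleted, AddressOf HandleParsing

        Dim mWebPage As HtmlDocument = browser.Document.OpenNew(True)
        mWebPage.Write(DownloadHtml(m_strUrl))

        ' Hand the document to the existing parsing code.
        m_onReady(mWebPage)
    End Sub

    ' Fetches the raw HTML without involving the browser control.
    Private Shared Function DownloadHtml(ByVal mUrl As String) As String
        Dim request As HttpWebRequest = CType(WebRequest.Create(mUrl), HttpWebRequest)
        Using response As HttpWebResponse = CType(request.GetResponse(), HttpWebResponse)
            Using reader As New StreamReader(response.GetResponseStream())
                Return reader.ReadToEnd()
            End Using
        End Using
    End Function
End Class
```

From a form, this could be called as `loader.Load(url, AddressOf ParsePage)`, where ParsePage is whatever existing routine takes an HtmlDocument. The download here happens synchronously inside the event handler, which blocks the UI thread; that keeps the sketch short but may not be acceptable for slow pages.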

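Answer 1 (score: 0)

I found a solution online and modified my code as shown below. To make it work, you have to add a reference to the "Microsoft HTML Object Library" (under the COM references).

It's dated, but it seems to be the only way to build an HTML document without using the WebBrowser control.

I hope it helps someone.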

Function mWebRe(ByVal mUrl As String) As MSHTML.HTMLDocument
    Dim request As HttpWebRequest = WebRequest.Create(mUrl)
    Dim doc As MSHTML.IHTMLDocument2 = New MSHTML.HTMLDocument

    ' Set some reasonable limits on resources used by this request
    request.MaximumAutomaticRedirections = 4
    request.MaximumResponseHeadersLength = 4

    ' Set credentials to use for this request.
    request.Credentials = CredentialCache.DefaultCredentials

    Try
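        Dim WebContent As String
        ' Dispose the response and reader as soon as the body has been read.
        Using response As HttpWebResponse = CType(request.GetResponse(), HttpWebResponse)
            Using reader As New StreamReader(response.GetResponseStream())
                WebContent = reader.ReadToEnd()
            End Using
        End Using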

        doc.clear()
        doc.write(WebContent)
        doc.close()

        ' Wait until the document has finished loading.
        While (doc.readyState <> "complete")
            ' Uncomment to wait longer between checks (if needed)
            'System.Threading.Thread.Sleep(1000)
            Application.DoEvents()
        End While
    Catch ex As Exception
        MsgBox(ex.ToString)
    End Try

    ' Explicit cast so the function also compiles with Option Strict On.
    Return CType(doc, MSHTML.HTMLDocument)
End Function