XMLHTTP60 request does not show the entire HTML document

Time: 2018-08-03 19:59:47

Tags: html excel vba parsing web-scraping

I am trying to retrieve the HTML document from a website so that I can extract data from it.

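Unfortunately, I cannot get the entire HTML document for the web page. My Debug.Print statement does not show the whole page that I want; the output is cut off. I am new to programming, and any help would be appreciated!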

My code is as follows:

Const SecForm4 As String = "https://www.secform4.com/significant-buys.htm"

Sub LoadWebPage()

    Dim XMLReq As New MSXML2.XMLHTTP60

    XMLReq.Open "GET", SecForm4, False
    XMLReq.send

    If XMLReq.Status <> 200 Or XMLReq.readyState <> 4 Then
        MsgBox "Problem" & vbNewLine & XMLReq.Status & "-" & XMLReq.statusText
        Exit Sub
    End If

    ParsingHTMLDocument XMLReq.responseText

End Sub

Sub ParsingHTMLDocument(HTMLText As String)

    Dim HTMLDoc As New MSHTML.HTMLDocument

    HTMLDoc.body.innerHTML = HTMLText
    Debug.Print HTMLText

End Sub

1 Answer:

Answer 0: (score: 0)

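The following works for retrieving both the document and the table. You are unlikely to be able to print the whole document to the Immediate window, because the window has a limited capacity. Instead, you can write the response to a text file and inspect it there.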

Change the file path "C:\Users\User\Desktop\Test.txt" to one of your own.

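Option Explicit

Public Sub GetInfo()
    Dim sResponse As String, docStart As Long
    Dim html As New HTMLDocument, hTable As HTMLTable

    ' Fetch the page synchronously and decode the raw response bytes
    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", "https://www.secform4.com/significant-buys.htm", False
        .Send
        sResponse = StrConv(.responseBody, vbUnicode)
    End With

    ' Drop anything that precedes the DOCTYPE declaration, if one is present
    docStart = InStr(1, sResponse, "<!DOCTYPE ")
    If docStart > 0 Then sResponse = Mid$(sResponse, docStart)

    ' Save the full response to disk, since the Immediate window cannot show all of it
    WriteTxtFile sResponse

    With html
        .body.innerHTML = sResponse
        Set hTable = .getElementById("filing_table")
        If hTable Is Nothing Then
            MsgBox "Table 'filing_table' not found"
        Else
            MsgBox hTable.localName
        End If
    End With
End Sub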

Public Sub WriteTxtFile(ByVal aString As String, Optional ByVal filePath As String = "C:\Users\User\Desktop\Test.txt")
    Dim fso As Object, Fileout As Object
    Set fso = CreateObject("Scripting.FileSystemObject")
    ' Overwrite any existing file (True) and write it as Unicode (True)
    Set Fileout = fso.CreateTextFile(filePath, True, True)
    Fileout.Write aString
    Fileout.Close
End Sub
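
To confirm that the request itself returned the complete page and that only the Immediate window is cutting off the output, a quick check along the following lines may help. This is only a sketch: the procedure name CheckResponseLength and the choice of 200 trailing characters are arbitrary, and it assumes the page ends with its closing markup. The Immediate window keeps only a limited number of recent lines (roughly the last 200), so printing the length and the tail of the string is more reliable than printing the whole text.

```vba
Option Explicit

' Hypothetical helper: reports how much text came back and shows the end of it
Public Sub CheckResponseLength()
    Dim sResponse As String

    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", "https://www.secform4.com/significant-buys.htm", False
        .Send
        sResponse = .responseText
    End With

    ' If the tail contains the closing </html> tag, the download was not truncated
    Debug.Print "Characters received: " & Len(sResponse)
    Debug.Print "Last 200 characters: " & Right$(sResponse, 200)
End Sub
```

If the closing tag shows up in the tail, the response is complete and the truncation seen earlier came from the Immediate window rather than from the request.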

The GetInfo code above requires a reference to the Microsoft HTML Object Library.
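
If adding that reference is not convenient, a late-bound variant is one possible alternative. The sketch below assumes that creating the document with CreateObject("htmlfile") is acceptable for this page; the procedure name GetTableLateBound is made up for illustration, and late-bound documents may expose fewer methods than the early-bound HTMLDocument, so treat it as a starting point rather than a drop-in replacement.

```vba
Option Explicit

' Hypothetical late-bound version: no reference to the HTML Object Library needed
Public Sub GetTableLateBound()
    Dim html As Object, hTable As Object, sResponse As String

    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", "https://www.secform4.com/significant-buys.htm", False
        .Send
        sResponse = .responseText
    End With

    ' "htmlfile" creates an HTML document object without early binding
    Set html = CreateObject("htmlfile")
    html.body.innerHTML = sResponse

    Set hTable = html.getElementById("filing_table")
    If hTable Is Nothing Then
        Debug.Print "filing_table not found"
    Else
        Debug.Print "Rows in table: " & hTable.Rows.Length
    End If
End Sub
```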