How to scrape data from the web when the page source does not contain the page content

Time: 2018-01-24 18:01:36

Tags: vba web-scraping

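I am trying to scrape data from here, but the problem I am facing is that the page source does not contain the content shown on the web page. I believe the content is generated by a script.

How can I get it? I was advised to use Selenium. Any other suggestions would be very helpful. Thanks.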

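' xhr is presumably an XMLHTTP request object and URL a String, both declared elsewhere.
With xhr
    .Open "GET", URL, False   ' False = synchronous request
    .send

    If .readyState = 4 And .Status = 200 Then
        ' Load the raw response text into an HTML document.
        Set internetdata = New MSHTML.HTMLDocument
        internetdata.body.innerHTML = .responseText
        htmlT = internetdata.body.outerHTML
    Else
        MsgBox "Error" & vbNewLine & "Ready state: " & .readyState & _
               vbNewLine & "HTTP request status: " & .Status
    End If
End With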

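From the string this code produces (htmlT), I am trying to get all of the text on the web page, but I am not getting all of the content.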

1 answer:

Answer 0: (score: 1)

Try this. It should fetch the description of every product:

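' Requires references to "Microsoft Internet Controls" and "Microsoft HTML Object Library".
Sub Web_Data()
    Dim IE As New InternetExplorer, html As HTMLDocument
    Dim topic As Object
    Dim r As Long   ' current output row on the active sheet

    With IE
        .Visible = True
        .navigate "http://www.webcollage.net/MainApp/preview-ppp?module=dellbtoc&site=epartner&wcpc=1512144817149&view=live&rcpName=Webcollage"
        ' Wait until the browser has finished loading the page.
        While .Busy = True Or .readyState < 4: DoEvents: Wend
        Set html = .document
    End With

    ' Give the page's scripts time to render the content.
    ' If the data is still missing, increase the wait time.
    Application.Wait Now + TimeValue("00:00:05")

    ' Write each product description to column A, one per row.
    For Each topic In html.getElementsByClassName("wc-rich-content-description")
        r = r + 1: Cells(r, 1) = topic.innerText
    Next topic

    IE.Quit
End Sub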
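
If the fixed five-second wait proves unreliable, one possible refinement is to poll the document until the target elements appear, with a timeout as a safety net. The following is only a minimal sketch: WaitForClass is a hypothetical helper name, and the 30-second default timeout is an assumption, not something tested against this page.

```vba
' Hypothetical helper: polls the document until at least one element with the
' given class name exists, or until timeoutSeconds have elapsed.
' Requires a reference to "Microsoft HTML Object Library".
Function WaitForClass(ByVal doc As HTMLDocument, ByVal className As String, _
                      Optional ByVal timeoutSeconds As Double = 30) As Boolean
    Dim startTime As Double
    startTime = Timer   ' seconds since midnight

    Do
        DoEvents   ' keep Excel responsive while waiting
        If doc.getElementsByClassName(className).Length > 0 Then
            WaitForClass = True
            Exit Function
        End If
    Loop While Timer - startTime < timeoutSeconds
    ' Falls through with WaitForClass = False if the timeout is reached.
End Function
```

In Web_Data, the Application.Wait line could then be replaced with something like If Not WaitForClass(html, "wc-rich-content-description") Then MsgBox "Content did not load in time", followed by the same For Each loop.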